NOVEL HCV NON-STRUCTURAL POLYPEPTIDE

FIELD OF THE INVENTION 

The present invention relates to polypeptides comprising a mutant
non-structural Hepatitis C virus ("HCV") polypeptide useful as immunogenic
compounds for use against HCV, methods of preparing and using the same, and
immunogenic compositions comprising the same. The present invention also relates
to compositions comprising (a) a mutant non-structural HCV polypeptide and (b) a
viral polypeptide that is not a non-structural HCV polypeptide, and methods of
using these compositions.

BACKGROUND OF THE INVENTION 

HCV is now recognized as the major agent of chronic hepatitis and liver disease 
worldwide. It is estimated that HCV infects about 400 million people worldwide, 
corresponding to more than 3% of the world population. 

Hepatitis C virus ("HCV") is a small enveloped RNA flavivirus, which contains
a positive-stranded RNA genome of about 10 kilobases. The genome has a single
uninterrupted ORF that encodes a protein of 3010-3011 amino acids. The proteins
of HCV include the structural core protein (C), which is highly immunogenic, and
two structural envelope proteins (E1 and E2), which likely form a heterodimer in
vivo, as well as the non-structural proteins NS2-NS5. It is known that the NS3
region of the virus is important for post-translational processing of the
polyprotein into individual proteins, and the NS5 region encodes an
RNA-dependent RNA polymerase.

Virus-specific T lymphocytes, along with neutralizing antibodies, are the
mainstay of the antiviral immune defense in established viral infections. Whereas
CD8+ cytotoxic T cells eliminate virus-infected cells, CD4+ T helper cells are
essential for the efficient regulation of the antiviral immune response. CD4+ T
helper cells recognize specific antigens as peptides bound to autologous HLA
class II molecules (viral antigens or particles are taken up by professional
antigen-presenting cells, processed to peptides, bound to HLA class II molecules
in the lysosomal compartment, and transported back to the cell surface). Several
observations support an important role of CD4+ T cells in the elimination of HCV
infection. Tsai et al., 1997 Hepatology 25:449-458; Diepolder et al., 1995 Lancet
346:1006-1007; Missale et al., 1996 JCI 98:706-714; Botarelli et al., 1993 Gastro
104:580-587; Diepolder et al., 1997 J. Virol. 71:6011. Immunogenic peptides
usually have a minimal length of 8-11 amino acids. However, since the peptide
binding groove of HLA class II molecules seems to be open at both ends, longer
peptides are tolerated. Thus peptides eluted from HLA class II molecules are
typically in the range of 15-25 amino acids. HLA class II molecules are extremely
polymorphic and each allele seems to have its individual requirements for peptide
binding. Thus the HLA class II repertoire of a given individual determines which
viral peptides can be presented to T cells. Recognition of the specific
HLA-peptide complex by the T cell receptor, accompanied by appropriate
costimulatory signals, leads to T cell activation, secretion of cytokines, and T
cell proliferation.

Numerous studies demonstrate that HLA Class II restricted CD4+ responses are
determined by stimulating peripheral blood mononuclear cells with recombinant
viral antigens or peptides. Botarelli et al., (1993) Gastroenterology 104:580-587;
Ferrari et al., (1994) Hepatology 19:286-295; Minutello et al., (1993) J. Exp. Med.
178:17-25; Hoffmann et al., (1995) Hepatology 21:632-638; Iwata et al., (1995)
Hepatology 22:1057-1064; and Tsai et al., (1995) Hepatology 21:908-912.

Polyclonal multispecific CD8+ T cell responses have been detected in patients
with chronic hepatitis C. Additionally, CD8+ CTLs were shown to be important in
resolving acute HCV infection in chimpanzees (Cooper et al., Immunity 1999). About
50% of patients with chronic hepatitis C demonstrate a detectable virus-specific
CD4+ T cell response, which is most frequently directed against HCV core and/or
NS4 and tends to be more common in patients who achieve sustained viral clearance
during interferon-α therapy.

Depending on the pattern of lymphokines, CD4+ T helper cells have been
classified as TH1, TH0, or TH2. Cytokines of the TH1 type are typically IFN-γ,
lymphotoxin, and interleukin-2 (IL-2), which are believed to support activation of
virus-specific CD8+ T cells and natural killer cells. The TH2 cytokines IL-4,
IL-5, IL-10, and IL-13 are important for B cell activation and differentiation,
thus inducing a humoral immune response.

During acute hepatitis C infection a strong and sustained TH1/TH0 response to
NS3 and possibly to other nonstructural proteins is associated with a self-limited
course of the disease. Diepolder et al., (1995) Lancet 346:1006-1007, showed all
CD4+ T cell clones to have a TH1 or TH0 cytokine profile, suggesting that the
clones support cytotoxic immune mechanisms in vivo. The majority of CD4+ T cell
clones responded to a relatively short segment of NS3, namely amino acids
1207-1278, suggesting that this region of NS3 is immunodominant for CD4+ T cells.
More than 70% of those who contract HCV develop chronic infection and hepatitis,
and a significant portion of them progress to cirrhosis and eventually
hepatocellular carcinoma. The only approved therapy at present is a 6- to
12-month course of interferon α, which leads to sustained improvement in only 20%
of patients. So far, no commercial vaccine is available.

Thus, there remains a need for compositions and methods capable of promoting 
anti-HCV responses. 

SUMMARY OF THE INVENTION

In one aspect, the present invention relates to isolated polypeptides comprising
mutant hepatitis C ("HCV") polypeptides comprising at least portions of NS3, NS4,
and NS5. In a preferred aspect, NS3 is encoded by a nucleic acid sequence having an
N-terminal deletion to remove the catalytic domain. The NS mutant polypeptides can
include NS3, NS4a, NS4b, NS5a, NS5b or portions thereof. For example, in various
embodiments, the mutant NS polypeptide comprises NS3, NS4 (NS4a and NS4b) and
NS5 (NS5a and NS5b). In other embodiments, the NS polypeptide consists of NS3 and
NS4 (for example, NS4a and/or NS4b) or NS3 and NS5 (for example, NS5a and/or
NS5b). Other combinations of full-length or fragments of non-structural components
are also contemplated.

In another preferred aspect, the polypeptides further comprise a viral
polypeptide that is not a non-structural HCV polypeptide. Such polypeptides are
preferably C, or antigenic fragments thereof, more preferably, truncated C of HCV.
Other polypeptides are preferably E, or antigenic fragments thereof, more
preferably, E1 or E2 of HCV. Such polypeptides need not be encoded by a natural
HCV genome, and include, for example, truncated or otherwise mutant HCV
polypeptides or polypeptides derived from other genomes, such as, for example,
polypeptides of HBV.

Thus, the invention includes an isolated mutant non-structural ("NS") HCV
polypeptide comprising a polypeptide having a mutation in the catalytic domain of
NS3 that functionally disrupts the catalytic domain. The mutation can be, for
example, a deletion or a substitution mutation. In certain embodiments, the mutant
NS polypeptide comprises NS3, NS4 and NS5. In other embodiments, the mutant NS
polypeptides described herein further comprise a second viral polypeptide that is
not NS3, NS4, or NS5 of HCV, for example an HCV Core polypeptide ("C"), or
fragment thereof, or an HCV envelope protein ("E"), for example E1 and/or E2. In
certain embodiments, C is truncated (e.g., at amino acid 121).

In another aspect, the present invention relates to compositions comprising any
of the mutant hepatitis C ("HCV") polypeptides described herein, for example
polypeptides comprising at least portions of NS3, NS4, and NS5. In a preferred
aspect, NS3 is encoded by a nucleic acid sequence having an N-terminal deletion to
disrupt the function of the catalytic domain, for example by removing this domain.
In another preferred aspect, the polypeptides further comprise a viral polypeptide
that is not a non-structural HCV polypeptide. Such polypeptides are preferably C,
or antigenic fragments thereof, more preferably, truncated C of HCV. Other
polypeptides are preferably E, or antigenic fragments thereof, more preferably, E1
or E2 of HCV. Such polypeptides need not be encoded by a natural HCV genome, and
include, for example, truncated or otherwise mutant HCV polypeptides or
polypeptides derived from other genomes, such as, for example, polypeptides of
HBV. In another aspect, the invention includes a composition comprising (a) any of
the polypeptides described herein; and (b) a pharmaceutically acceptable excipient
(e.g., carrier and/or adjuvant).

In another aspect, the invention includes an isolated and purified polynucleotide
which encodes any of the mutant HCV polypeptides described herein. In certain
embodiments, the invention includes a composition comprising (a) the isolated
purified polynucleotide encoding any of the mutant HCV polypeptides; and (b) a
pharmaceutically acceptable excipient. The polynucleotide can be, for example,
DNA, or can be in a plasmid. Additionally, the polynucleotides described herein may
be included in an expression vector as shown in the attached Figures and Sequence
Listings.

In another aspect, the present invention relates to host cells transformed with
expression vectors comprising a nucleic acid sequence encoding a mutant HCV
polypeptide comprising at least portions of NS3, NS4, and NS5. In a preferred
aspect, the expression vectors of the host cells further comprise at least one
nucleic acid sequence encoding a viral polypeptide that is not a non-structural
HCV polypeptide. Such polypeptides are preferably C, or antigenic fragments
thereof, more preferably, truncated C of HCV. Other polypeptides are preferably E,
or antigenic fragments thereof, more preferably, E1 or E2 of HCV. Such
polypeptides need not be encoded by a natural HCV genome, and include, for example,
truncated or otherwise mutant HCV polypeptides or polypeptides derived from other
genomes, such as, for example, polypeptides of HBV. In another preferred aspect the
nucleic acid sequences of the expression vectors are coexpressed. In yet another
preferred aspect, the host cells are yeast cells or mammalian cells.

In another aspect, the present invention relates to expression vectors comprising
a nucleic acid sequence encoding a mutant HCV polypeptide comprising NS3, NS4,
and NS5. In a preferred aspect, the expression vectors further comprise at least
one nucleic acid sequence encoding a viral polypeptide that is not a
non-structural HCV polypeptide. Such polypeptides are preferably C, or antigenic
fragments thereof, more preferably, truncated C of HCV. Other polypeptides are
preferably E, or antigenic fragments thereof, more preferably, E1 or E2 of HCV.
Importantly, such polypeptides need not be encoded by a natural HCV genome, and
include, for example, truncated or otherwise mutant HCV polypeptides or
polypeptides derived from other genomes, such as, for example, polypeptides of HBV.

In another aspect, the present invention relates to methods of preparing mutant
HCV polypeptides. In a preferred aspect, the method comprises the steps of
transforming a host cell with an expression vector, said vector comprising a
nucleic acid sequence encoding a mutant HCV polypeptide comprising at least
portions of NS3, NS4, and NS5, and isolating said polypeptide. In another preferred
aspect the HCV polypeptide further comprises a viral polypeptide that is not a
non-structural HCV polypeptide. Such polypeptides are preferably C, or antigenic
fragments thereof, more preferably, truncated C of HCV. Other polypeptides are
preferably E, or antigenic fragments thereof, more preferably, E1 or E2 of HCV.
Such polypeptides need not be encoded by



WO 01/38360    PCT/US00/32326



a natural HCV genome, and include, for example, truncated or otherwise mutant HCV 
polypeptides or polypeptides derived from other genomes, such as, for example, 
polypeptides of HBV. In another preferred aspect the host cells are yeast cells or 
mammalian cells.

In another aspect, the present invention relates to antibodies which specifically
bind to a mutant HCV polypeptide comprising NS3, NS4, and NS5, and to methods of
making and using the same. In a preferred aspect, the HCV polypeptide further
comprises a viral polypeptide that is not a non-structural HCV polypeptide. Such
polypeptides are preferably C, or antigenic fragments thereof, more preferably,
truncated C of HCV. Other polypeptides are preferably E, or antigenic fragments
thereof, more preferably, E1 or E2 of HCV. Such polypeptides need not be encoded
by a natural HCV genome, and include, for example, truncated or otherwise mutant
HCV polypeptides or polypeptides derived from other genomes, such as, for example,
polypeptides of HBV. In another preferred aspect, the antibody is either monoclonal
or polyclonal.

In yet another aspect, a method of preparing a mutant NS HCV polypeptide is
provided, wherein the method comprises the steps of (a) transforming a host cell
with any of the expression vectors described herein, under conditions wherein the
polypeptide is expressed; and (b) isolating the polypeptide. The host cell can be,
for example, a yeast cell, a mammalian cell, a plant cell or an insect cell. The
polypeptide can be expressed and isolated intracellularly or can be secreted and
isolated from the surrounding environment.

In a still further aspect, a method of eliciting an immune response in a subject
is provided. The immune response can be elicited by administering any of the
polynucleotides and/or polypeptides described herein in one or multiple doses.

These and other embodiments of the subject invention will readily occur to 
those of skill in the art in light of the disclosure herein. 

BRIEF DESCRIPTION OF THE FIGURES 

FIG. 1 shows the cloning scheme for generating pCMV-NS35.

FIG. 2 shows the 9621bp vector pCMV-NS35.

FIG. 3 shows the nucleic acid sequence of pCMV-NS35 (SEQ ID NO:1), including the
nucleic acid sequence of the NS35 ORF, and also the translation of NS35 (SEQ ID
NO:2).

FIG. 4 shows the 9621bp pCMV-delNS35.

FIG. 5 shows the nucleic acid sequence of pCMV-delNS35 (SEQ ID NO:3), including
the nucleic acid sequence of the delNS35 ORF, and also the translation of the
delNS35 polypeptide (SEQ ID NO:4).

FIG. 6 shows the 4276bp pCMV-II.

FIG. 7 shows the nucleic acid sequence of pCMV-II (SEQ ID NO:5).

FIG. 8 shows the 6300bp pCMV-NS34A.

FIG. 9 shows the nucleic acid sequence of pCMV-NS34A (SEQ ID NO:6), including
the nucleic acid sequence of the NS34A ORF, and also the translation of NS34A (SEQ
ID NO:7).

FIG. 10 shows the cloning scheme for generating pd.ΔNS3NS5.

FIG. 11 shows the nucleic and amino acid sequences of pd.ΔNS3NS5 (SEQ ID NO:8
and 9).

FIG. 12 shows the Western blot of proteins expressed by S. cerevisiae strain AD3
transformed with pd.ΔNS3NS5.

FIG. 13 shows the cloning scheme for generating pd.ΔNS3NS5.pj.

FIG. 14 shows the nucleic and amino acid sequences of pd.ΔNS3NS5.pj (SEQ ID
NO:10 and 11).

FIG. 15 shows the Western blot of proteins expressed by S. cerevisiae strain AD3
transformed with pd.ΔNS3NS5.pj, specifically demonstrating the expression of
ΔNS3NS5 polypeptide.

FIG. 16 shows the cloning scheme for generating pdΔNS3NS5.pj.core121RT and
pdΔNS3NS5.pj.core173RT.

FIG. 17 shows the nucleic and amino acid sequences of pd.ΔNS3NS5.pj.core121 (SEQ
ID NO:12 and 13).

FIG. 18 shows the nucleic and amino acid sequences of pd.ΔNS3NS5.pj.core173 (SEQ
ID NO:14 and 15).

FIG. 19 shows the Western blot of proteins expressed by S. cerevisiae strain AD3
transformed with pd.ΔNS3NS5.pj, specifically demonstrating the expression of
ΔNS3NS5.core121 and ΔNS3NS5.core173 polypeptides. Lanes 1 and 7 show See
Blue Standards. Lane 2 shows control yeast plasmid. Lanes 3 and 4 show
ΔNS3NS5.core121RT polypeptide, colonies 1 and 2. Lanes 5 and 6 show
ΔNS3NS5.core173RT polypeptide, colonies 3 and 4.

FIG. 20 shows the cloning scheme for generating pdΔNS3NS5.pj.core140RT and
pdΔNS3NS5.pj.core150RT.

FIG. 21 shows the nucleic and amino acid sequences of pd.ΔNS3NS5.pj.core140 (SEQ
ID NO:16 and 17).

FIG. 22 shows the nucleic and amino acid sequences of pd.ΔNS3NS5.pj.core150 (SEQ
ID NO:18 and 19).

FIG. 23 shows the Western blot of proteins expressed by S. cerevisiae strain AD3
transformed with pd.ΔNS3NS5.pj, specifically demonstrating the expression of
ΔNS3NS5core140 and ΔNS3NS5core150 polypeptides. Lane 1 shows See Blue
Standards. Lanes 2 and 3 show ΔNS3NS5core140RT polypeptide, colonies 5 and 6.
Lanes 4 and 5 show ΔNS3NS5core150RT polypeptide, colonies 7 and 8. Lane 6 shows
control yeast plasmid. Lane 7 shows ΔNS3NS5core121RT polypeptide, colony 1.
Lane 8 shows ΔNS3NS5core173RT polypeptide, colony 5.

DETAILED DESCRIPTION OF THE INVENTION 

The practice of the present invention will employ, unless otherwise indicated,
conventional techniques of molecular biology, microbiology, recombinant DNA
techniques, and immunology, which are within the skill of the art. Such techniques
are explained fully in the literature. See, e.g., Sambrook, et al., MOLECULAR
CLONING: A LABORATORY MANUAL (1989); DNA CLONING, VOLUMES I AND II (D. N.
Glover ed. 1985); OLIGONUCLEOTIDE SYNTHESIS (M. J. Gait ed., 1984);
NUCLEIC ACID HYBRIDIZATION (B. D. Hames & S. J. Higgins eds. 1984);
TRANSCRIPTION AND TRANSLATION (B. D. Hames & S. J. Higgins eds. 1984);
ANIMAL CELL CULTURE (R. I. Freshney ed. 1986); IMMOBILIZED CELLS AND
ENZYMES (IRL Press, 1986); B. Perbal, A PRACTICAL GUIDE TO MOLECULAR
CLONING (1984); the series, METHODS IN ENZYMOLOGY (Academic Press,
Inc.); GENE TRANSFER VECTORS FOR MAMMALIAN CELLS (J. H. Miller and
M. P. Calos eds. 1987, Cold Spring Harbor Laboratory); Methods in Enzymology Vol.
154 and Vol. 155 (Wu and Grossman, and Wu, eds., respectively); Mayer and Walker
eds. (1987), IMMUNOHISTOCHEMICAL METHODS IN CELL AND
MOLECULAR BIOLOGY (Academic Press, London); Scopes, (1987), PROTEIN
PURIFICATION: PRINCIPLES AND PRACTICE, Second Edition (Springer-Verlag,
New York); and HANDBOOK OF EXPERIMENTAL IMMUNOLOGY, VOLUMES
I-IV (D. M. Weir and C. C. Blackwell eds. 1986).

It must be noted that, as used in this specification and the appended claims, the
singular forms "a", "an" and "the" include plural referents unless the content
clearly dictates otherwise. Thus, for example, reference to "an antigen" includes a
mixture of two or more antigens, and the like.

I. Definitions 

In describing the present invention, the following terms will be employed, and 
are intended to be defined as indicated below. 

The term "hepatitis C virus" (HCV) refers to an agent causative of Non-A, Non-B
Hepatitis (NANBH). The nucleic acid sequence and putative amino acid sequence of
HCV are described in U.S. Patent Nos. 5,856,437 and 5,350,671. The disease caused
by HCV is called hepatitis C, formerly called NANBH. The term HCV, as used herein,
denotes a viral species of which pathogenic strains cause NANBH, as well as
attenuated strains or defective interfering particles derived therefrom.

HCV is a member of the viral family Flaviviridae. The morphology and
composition of Flavivirus particles are known, and are discussed in Reed et al.,
Curr. Stud. Hematol. Blood Transfus. (1998) 62:1-37; HEPATITIS C VIRUSES IN FIELDS
VIROLOGY (B.N. Fields, D.M. Knipe, P.M. Howley, eds.) (3d ed. 1996). It has
recently been found that portions of the HCV genome are also homologous to
pestiviruses. Generally, with respect to morphology, Flaviviruses contain a central
nucleocapsid surrounded by a lipid bilayer. Virions are spherical and have a
diameter of about 40-50 nm. Their cores are about 25-30 nm in diameter. Along the
outer surface of the virion envelope are projections that are about 5-10 nm long
with terminal knobs about 2 nm in diameter.

The HCV genome is composed of RNA. It is known that RNA-containing
viruses have relatively high rates of spontaneous mutation. Therefore, there can be
multiple strains, which can be virulent or avirulent, within the HCV class or
species. The ORF of HCV, including the translation spans of the core,
non-structural, and envelope proteins, is shown in U.S. Patent Nos. 5,856,437 and
5,350,671.

The terms "polypeptide" and "protein" refer to a polymer of amino acid
residues and are not limited to a minimum length of the product. Thus, peptides,
oligopeptides, dimers, multimers, and the like, are included within the definition.
Both full-length proteins and fragments thereof are encompassed by the definition.
The terms also include postexpression modifications of the polypeptide, for
example, glycosylation, acetylation, phosphorylation and the like. Furthermore, for
purposes of the present invention, a "polypeptide" refers to a protein which
includes modifications, such as deletions, additions and substitutions (generally
conservative in nature), to the native sequence, so long as the protein maintains
the desired activity. These modifications may be deliberate, as through
site-directed mutagenesis, or may be accidental, such as through mutations of hosts
which produce the proteins or errors due to PCR amplification.

An HCV polypeptide is a polypeptide, as defined above, derived from the HCV
polyprotein. The polypeptide need not be physically derived from HCV, but may be
synthetically or recombinantly produced. Moreover, the polypeptide may be derived
from any of the various HCV strains, such as from strains 1, 2, 3 or 4 of HCV. A
number of conserved and variable regions are known between these strains and, in
general, the amino acid sequences of epitopes derived from these regions will have
a high degree of sequence homology, e.g., amino acid sequence homology of more than
30%, preferably more than 40%, when the two sequences are aligned and homology
determined by any of the programs or algorithms described herein. Thus, for example,
the term "NS4" polypeptide refers to native NS4 from any of the various HCV strains,
as well as NS4 analogs, muteins and immunogenic fragments, as defined further below.
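
The percentage thresholds above presuppose a scoring rule applied to two already-aligned sequences. The following is a minimal sketch of one such rule, simple percent identity over ungapped alignment columns. The function name, the gap character, and the choice of identity (rather than a similarity matrix) are illustrative assumptions and are not taken from this specification or from any particular alignment program.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over alignment columns where neither sequence has a gap.

    Both inputs are assumed to be rows of an existing pairwise alignment,
    so they must have the same length. This is an illustrative measure only.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap or b == gap:
            continue  # skip columns that contain a gap in either row
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


def exceeds_threshold(aligned_a: str, aligned_b: str, threshold: float = 30.0) -> bool:
    """True when the identity is strictly greater than the given percentage."""
    return percent_identity(aligned_a, aligned_b) > threshold
```

Under these assumptions, two hypothetical four-residue rows that differ at one position score 75% and would pass both the 30% and 40% thresholds.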

Further, the terms "ΔNS35," "delNS35," "ΔNS3NS5," and "ΔNS3-5" as used
herein refer to a mutant polypeptide, comprising at least portions of NS3, NS4, or
NS5, comprising a deletion in, or mutation of, the NS3 protease active site region
to render the protease non-functional. In one embodiment, ΔNS3-5 comprises amino
acids 1242-3011, as shown in FIG. 5, or polypeptides substantially homologous
thereto. It will be readily apparent to one of ordinary skill in the art how to
determine that NS3 protease has been rendered non-functional. If the protease is
functional, one will obtain protein of the expected molecular weight upon
expression. As set forth in Example 2 and Figure 15, using SDS-PAGE, 4-20%, a
protein having a molecular weight of approximately 194kD was obtained when strain
AD3 was transformed with pd.ΔNS3NS5.pj clone #5. One skilled in the art could
readily determine whether a protein of the desired molecular weight was expressed
for any given deletion or mutation.
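
As a rough plausibility check on the figures quoted above, the length of the amino acid span 1242-3011 can be converted to an approximate molecular weight. The sketch below assumes an average residue mass of about 110 Da, a common rule of thumb that is not stated in this specification; the helper names are hypothetical, and the estimate ignores any extra residues (such as an initiator methionine) and any post-translational modification.

```python
# Assumption: ~110 Da per residue is a textbook rule of thumb, not a value
# given in this specification.
AVERAGE_RESIDUE_MASS_DA = 110.0


def span_length(first_residue: int, last_residue: int) -> int:
    """Number of residues in a 1-based, inclusive residue range."""
    if first_residue < 1 or last_residue < first_residue:
        raise ValueError("expected 1 <= first_residue <= last_residue")
    return last_residue - first_residue + 1


def estimate_mass_kda(n_residues: int) -> float:
    """Approximate polypeptide mass in kilodaltons from its residue count."""
    return n_residues * AVERAGE_RESIDUE_MASS_DA / 1000.0


if __name__ == "__main__":
    n = span_length(1242, 3011)
    print(n, round(estimate_mass_kda(n), 1))  # prints: 1770 194.7
```

Under this assumption the 1770-residue span corresponds to roughly 195 kD, which is in line with the approximately 194kD band reported for the uncleaved product.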

The terms "analog" and "mutein" refer to biologically active derivatives of the
reference molecule, or fragments of such derivatives, that retain desired activity,
such as the ability to stimulate a cell-mediated immune response, as defined below.
In general, the term "analog" refers to compounds having a native polypeptide
sequence and structure with one or more amino acid additions, substitutions
(generally conservative in nature) and/or deletions, relative to the native
molecule, so long as the modifications do not destroy immunogenic activity. The
term "mutein" refers to peptides having one or more peptide mimics ("peptoids"),
such as those described in International Publication No. WO 91/04282. Preferably,
the analog or mutein has at least the same immunoactivity as the native molecule.
Methods for making polypeptide analogs and muteins are known in the art and are
described further below.

Particularly preferred analogs include substitutions that are conservative in
nature, i.e., those substitutions that take place within a family of amino acids
that are related in their side chains. Specifically, amino acids are generally
divided into four families: (1) acidic - aspartate and glutamate; (2) basic -
lysine, arginine, histidine; (3) non-polar - alanine, valine, leucine, isoleucine,
proline, phenylalanine, methionine, tryptophan; and (4) uncharged polar - glycine,
asparagine, glutamine, cysteine, serine, threonine, tyrosine. Phenylalanine,
tryptophan, and tyrosine are sometimes classified as aromatic amino acids. For
example, it is reasonably predictable that an isolated replacement of leucine with
isoleucine or valine, an aspartate with a glutamate, a threonine with a serine, or
a similar conservative replacement of an amino acid with a structurally related
amino acid, will not have a major effect on the biological activity. For example,
the polypeptide of interest may include up to about 5-10 conservative or
non-conservative amino acid substitutions, or even up to about 15-25 conservative
or non-conservative amino acid substitutions, or any integer between 5-25, so long
as the desired function of the molecule remains intact. One of skill in the art may
readily determine regions of the molecule of interest that can tolerate change by
reference to Hopp/Woods and Kyte-Doolittle plots, well known in the art.
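
The four side-chain families listed above can be expressed as a small lookup table for classifying a substitution as conservative (within a family) or non-conservative (across families). The following is a minimal sketch using one-letter amino acid codes; the function names are hypothetical, and the sketch only encodes the family grouping given in the text, not any prediction of biological activity.

```python
# Side-chain families as listed in the text, in one-letter codes.
FAMILIES = {
    "acidic": set("DE"),                 # aspartate, glutamate
    "basic": set("KRH"),                 # lysine, arginine, histidine
    "non-polar": set("AVLIPFMW"),        # Ala, Val, Leu, Ile, Pro, Phe, Met, Trp
    "uncharged polar": set("GNQCSTY"),   # Gly, Asn, Gln, Cys, Ser, Thr, Tyr
}


def family_of(residue: str) -> str:
    """Return the family name for a one-letter residue code."""
    code = residue.upper()
    for name, members in FAMILIES.items():
        if code in members:
            return name
    raise ValueError(f"not a standard amino acid code: {residue!r}")


def is_conservative(original: str, replacement: str) -> bool:
    """True when both residues belong to the same side-chain family."""
    return family_of(original) == family_of(replacement)


def count_substitutions(native: str, variant: str):
    """Count (conservative, non_conservative) substitutions between two
    equal-length sequences; insertions and deletions are not handled."""
    if len(native) != len(variant):
        raise ValueError("sequences must have equal length")
    conservative = 0
    non_conservative = 0
    for a, b in zip(native.upper(), variant.upper()):
        if a == b:
            continue
        if is_conservative(a, b):
            conservative += 1
        else:
            non_conservative += 1
    return conservative, non_conservative
```

For instance, leucine to isoleucine, aspartate to glutamate, and threonine to serine are all classified as conservative by this table, matching the examples given in the text.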

By "fragment" is intended a polypeptide consisting of only a part of the intact
full-length polypeptide sequence and structure. The fragment can include a
C-terminal deletion and/or an N-terminal deletion of the native polypeptide. An
"immunogenic fragment" of a particular HCV protein will generally include at least
about 5-10 contiguous amino acid residues of the full-length molecule, preferably
at least about 15-25 contiguous amino acid residues of the full-length molecule,
and most preferably at least about 20-50 or more contiguous amino acid residues of
the full-length molecule, that define an epitope, or any integer between 5 amino
acids and the full-length sequence, provided that the fragment in question retains
immunogenic activity, as measured by the assays described herein. For a description
of various HCV epitopes, see, e.g., Chien et al., Proc. Natl. Acad. Sci. USA (1992)
89:10011-10015; Chien et al., J. Gastroent. Hepatol. (1993) 8:S33-39; Chien et al.,
International Publication No. WO 93/00365; Chien, D.Y., International Publication
No. WO 94/01778; commonly owned, allowed U.S. Patent Application Serial Nos.
08/403,590 and 08/444,818.

The term "epitope" as used herein refers to a sequence of at least about 3 to 5,
preferably about 5 to 10 or 15, and not more than about 1,000 amino acids (or any
integer therebetween), which define a sequence that by itself or as part of a
larger sequence, binds to an antibody generated in response to such sequence. There
is no critical upper limit to the length of the fragment, which may comprise nearly
the full-length of the protein sequence, or even a fusion protein comprising two or
more epitopes from the HCV polyprotein. An epitope for use in the subject invention
is not limited to a polypeptide having the exact sequence of the portion of the
parent protein from which it is derived. Indeed, viral genomes are in a state of
constant flux and contain several variable domains which exhibit relatively high
degrees of variability between isolates. Thus the term "epitope" encompasses
sequences identical to the native sequence, as well as modifications to the native
sequence, such as deletions, additions and substitutions (generally conservative in
nature).




Regions of a given polypeptide that include an epitope can be identified using
any number of epitope mapping techniques, well known in the art. See, e.g., Epitope
Mapping Protocols in Methods in Molecular Biology, Vol. 66 (Glenn E. Morris, Ed.,
1996) Humana Press, Totowa, New Jersey. For example, linear epitopes may be
determined by, e.g., concurrently synthesizing large numbers of peptides on solid
supports, the peptides corresponding to portions of the protein molecule, and
reacting the peptides with antibodies while the peptides are still attached to the
supports. Such techniques are known in the art and described in, e.g., U.S. Patent
No. 4,708,871; Geysen et al. (1984) Proc. Natl. Acad. Sci. 81:3998-4002; Geysen et
al. (1986) Molec. Immunol. 23:709-715. Similarly, conformational epitopes are
readily identified by determining spatial conformation of amino acids such as by,
e.g., x-ray crystallography and 2-dimensional nuclear magnetic resonance. See,
e.g., Epitope Mapping Protocols, supra. Antigenic regions of proteins can also be
identified using standard antigenicity and hydropathy plots, such as those
calculated using, e.g., the Omiga version 1.0 software program available from the
Oxford Molecular Group. This computer program employs the Hopp/Woods method, Hopp
et al., Proc. Natl. Acad. Sci. USA (1981) 78:3824-3828 for determining antigenicity
profiles, and the Kyte-Doolittle technique, Kyte et al., J. Mol. Biol. (1982)
157:105-132 for hydropathy plots.
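
A hydropathy plot of the Kyte-Doolittle type is, in essence, a sliding-window average of per-residue hydropathy values along the sequence. The following is a minimal sketch of that calculation using the published Kyte-Doolittle scale. It is not the Omiga program mentioned above; the function name and the default window of 7 residues are illustrative assumptions.

```python
from typing import List

# Kyte-Doolittle hydropathy values per residue (one-letter codes).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(sequence: str, window: int = 7) -> List[float]:
    """Mean Kyte-Doolittle hydropathy for each full window along the sequence.

    Returns len(sequence) - window + 1 values; entry i is the average over
    residues i .. i + window - 1 (0-based). The window size is an assumption
    chosen for illustration.
    """
    seq = sequence.upper()
    if window < 1 or window > len(seq):
        raise ValueError("window must be between 1 and the sequence length")
    try:
        scores = [KYTE_DOOLITTLE[aa] for aa in seq]
    except KeyError as exc:
        raise ValueError(f"non-standard residue: {exc.args[0]!r}") from None
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(seq) - window + 1)
    ]
```

Strongly negative windows indicate hydrophilic stretches, which are more likely to be surface-exposed; such regions are only candidates and would still need to be confirmed by the mapping techniques described in the text.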

As used herein, the term "conformational epitope" refers to a portion of a
full-length protein, or an analog or mutein thereof, having structural features
native to the amino acid sequence encoding the epitope within the full-length
natural protein. Native structural features include, but are not limited to,
glycosylation and three dimensional structure. Preferably, a conformational epitope
is produced recombinantly and is expressed in a cell from which it is extractable
under conditions which preserve its desired structural features, e.g., without
denaturation of the epitope. Such cells include bacteria, yeast, insect, and
mammalian cells. Expression and isolation of recombinant conformational epitopes
from the HCV polyprotein are described in, e.g., International Publication Nos.
WO 96/04301, WO 94/01778, WO 95/33053, WO 92/08734.

An "immunological response" to an HCV antigen (including both polypeptides
and polynucleotides encoding polypeptides that are expressed in vivo) or
composition is the development in a subject of a humoral and/or a cellular immune
response to molecules present in the composition of interest. For purposes of the
present invention, a "humoral immune response" refers to an immune response
mediated by antibody molecules, while a "cellular immune response" is one mediated
by T-lymphocytes and/or other white blood cells. One important aspect of cellular
immunity involves an antigen-specific response by cytolytic T-cells ("CTLs"). CTLs
have specificity for peptide antigens that are presented in association with
proteins encoded by the major histocompatibility complex (MHC) and expressed on the
surfaces of cells. CTLs help induce and promote the intracellular destruction of
intracellular microbes, or the lysis of cells infected with such microbes. Another
aspect of cellular immunity involves an antigen-specific response by helper
T-cells. Helper T-cells act to help stimulate the function, and focus the activity
of, nonspecific effector cells against cells displaying peptide antigens in
association with MHC molecules on their surface. A "cellular immune response" also
refers to the production of cytokines, chemokines and other such molecules produced
by activated T-cells and/or other white blood cells, including those derived from
CD4+ and CD8+ T-cells.

A composition or vaccine that elicits a cellular immune response may serve to
sensitize a vertebrate subject by the presentation of antigen in association with MHC
molecules at the cell surface. The cell-mediated immune response is directed at, or
near, cells presenting antigen at their surface. In addition, antigen-specific
T-lymphocytes can be generated to allow for the future protection of an immunized host.

The ability of a particular antigen to stimulate a cell-mediated immunological
response may be determined by a number of assays, such as by lymphoproliferation
(lymphocyte activation) assays, CTL cytotoxic cell assays, or by assaying for
T-lymphocytes specific for the antigen in a sensitized subject. Such assays are well
known in the art. See, e.g., Erickson et al., J. Immunol. (1993) 151:4189-4199; Doe et
al., Eur. J. Immunol. (1994) 24:2369-2376; and the examples below.

Thus, an immunological response as used herein may be one which stimulates
the production of CTLs, and/or the production or activation of helper T-cells. The
antigen of interest may also elicit an antibody-mediated immune response. Hence, an
immunological response may include one or more of the following effects: the
production of antibodies by B-cells; and/or the activation of suppressor T-cells and/or
γδ T-cells directed specifically to an antigen or antigens present in the composition or
vaccine of interest. These responses may serve to neutralize infectivity, and/or mediate
antibody-complement, or antibody dependent cell cytotoxicity (ADCC) to provide
protection or alleviation of symptoms to an immunized host. Such responses can be
determined using standard immunoassays and neutralization assays, well known in the
art.

A "coding sequence" or a sequence which "encodes" a selected polypeptide, is a
nucleic acid molecule which is transcribed (in the case of DNA) and translated (in the
case of mRNA) into a polypeptide in vitro or in vivo when placed under the control of
appropriate regulatory sequences. The boundaries of the coding sequence are
determined by a start codon at the 5' (amino) terminus and a translation stop codon at
the 3' (carboxy) terminus. A transcription termination sequence may be located 3' to
the coding sequence.
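
As a minimal sketch of how these boundaries can be located on a sense-strand DNA sequence, the following scans for the first ATG and reads in frame to the first stop codon of the standard genetic code. The function name find_coding_sequence is hypothetical, and the sketch ignores regulatory sequences, introns and alternative start codons.

```python
# Stop codons of the standard genetic code, written as DNA.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_coding_sequence(dna):
    """Return the first ATG...stop stretch (stop codon included), or None.

    Illustrative only: each ATG is tried in turn as a candidate start
    codon and the sequence is read in frame until a stop codon is found.
    """
    dna = dna.upper()
    start = dna.find("ATG")
    while start != -1:
        # Step codon by codon downstream of the candidate start.
        for i in range(start + 3, len(dna) - 2, 3):
            if dna[i:i + 3] in STOP_CODONS:
                return dna[start:i + 3]
        start = dna.find("ATG", start + 1)
    return None
```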

A "nucleic acid" molecule or "polynucleotide" can include both double- and
single-stranded sequences and refers to, but is not limited to, cDNA from viral,
procaryotic or eucaryotic mRNA, genomic DNA sequences from viral (e.g. DNA
viruses and retroviruses) or procaryotic DNA, and especially synthetic DNA sequences.
The term also captures sequences that include any of the known base analogs of DNA
and RNA.

"Operably linked" refers to an arrangement of elements wherein the 
components so described are configured so as to perform their desired function. Thus,
a given promoter operably linked to a coding sequence is capable of effecting the
expression of the coding sequence when the proper transcription factors, etc., are 
present. The promoter need not be contiguous with the coding sequence, so long as it 
functions to direct the expression thereof. Thus, for example, intervening untranslated 
yet transcribed sequences can be present between the promoter sequence and the coding
sequence, as can transcribed introns, and the promoter sequence can still be considered
"operably linked" to the coding sequence. 

"Recombinant" as used herein to describe a nucleic acid molecule means a 
polynucleotide of genomic, cDNA, viral, semisynthetic, or synthetic origin which, by 
virtue of its origin or manipulation is not associated with all or a portion of the
polynucleotide with which it is associated in nature. The term "recombinant" as used
with respect to a protein or polypeptide means a polypeptide produced by expression of 
a recombinant polynucleotide. In general, the gene of interest is cloned and then
expressed in transformed organisms, as described further below. The host organism
expresses the foreign gene to produce the protein under expression conditions. 

A "control element" refers to a polynucleotide sequence which aids in the 
expression of a coding sequence to which it is linked. The term includes promoters,
transcription termination sequences, upstream regulatory domains, polyadenylation
signals, untranslated regions, including 5'-UTRs and 3'-UTRs and when appropriate, 
leader sequences and enhancers, which collectively provide for the transcription and 
translation of a coding sequence in a host cell. 

A "promoter" as used herein is a DNA regulatory region capable of binding
RNA polymerase in a host cell and initiating transcription of a downstream (3'
direction) coding sequence operably linked thereto. For purposes of the present
invention, a promoter sequence includes the minimum number of bases or elements
necessary to initiate transcription of a gene of interest at levels detectable above
background. Within the promoter sequence is a transcription initiation site, as well as
protein binding domains (consensus sequences) responsible for the binding of RNA
polymerase. Eucaryotic promoters will often, but not always, contain "TATA" boxes
and "CAT" boxes.

A control sequence "directs the transcription" of a coding sequence in a cell 
when RNA polymerase will bind the promoter sequence and transcribe the coding 
sequence into mRNA, which is then translated into the polypeptide encoded by the
coding sequence. 

"Expression cassette" or "expression construct" refers to an assembly which is 
capable of directing the expression of the sequence(s) or gene(s) of interest. The 
expression cassette includes control elements, as described above, such as a promoter
which is operably linked to (so as to direct transcription of) the sequence(s) or gene(s)
of interest, and often includes a polyadenylation sequence as well. Within certain 
embodiments of the invention, the expression cassette described herein may be 
contained within a plasmid construct. In addition to the components of the expression
cassette, the plasmid construct may also include one or more selectable markers, a
signal which allows the plasmid construct to exist as single-stranded DNA (e.g., an M13
origin of replication), at least one multiple cloning site, and a "mammalian" origin of
replication (e.g., an SV40 or adenovirus origin of replication).

"Transformation," as used herein, refers to the insertion of an exogenous
polynucleotide into a host cell, irrespective of the method used for insertion: for
example, transformation by direct uptake, transfection, infection, and the like. For
particular methods of transfection, see further below. The exogenous polynucleotide
may be maintained as a nonintegrated vector, for example, an episome, or alternatively,
may be integrated into the host genome.

A "host cell" is a cell which has been transformed, or is capable of
transformation, by an exogenous DNA sequence. 

By "isolated" is meant, when referring to a polypeptide, that the indicated
molecule is separate and discrete from the whole organism with which the molecule is
found in nature or is present in the substantial absence of other biological
macromolecules of the same type. The term "isolated" with respect to a polynucleotide is a
nucleic acid molecule devoid, in whole or part, of sequences normally associated with
it in nature; or a sequence, as it exists in nature, but having heterologous sequences in
association therewith; or a molecule disassociated from the chromosome.

The term "purified" as used herein preferably means at least 75% by weight, 
more preferably at least 85% by weight, more preferably still at least 95% by weight, 
and most preferably at least 98% by weight, of biological macromolecules of the same 
type are present. 

"Homology" refers to the percent identity between two polynucleotide or two
polypeptide moieties. Two DNA, or two polypeptide sequences are "substantially
homologous" to each other when the sequences exhibit at least about 50%, preferably
at least about 75%, more preferably at least about 80%-85%, preferably at least about
90%, and most preferably at least about 95%-98%, or more, sequence identity over a
defined length of the molecules. As used herein, substantially homologous also refers
to sequences showing complete identity to the specified DNA or polypeptide sequence. 
The term "substantially homologous" as used herein in reference to ΔNS35 generally
refers to an HCV nucleic or amino acid sequence that is at least 60% identical to the
entire sequence of the polypeptide encoded by ΔNS35 (see FIG. 5), where the sequence
identity is preferably at least 75%, more preferably at least 80%, still more preferably at
least about 85%, especially more than about 90%, most preferably 95% or greater,
particularly 98% or greater. These homologous polypeptides include fragments,
including mutants and allelic variants of the fragments. Identity between the two
sequences is preferably determined by the Smith-Waterman homology search algorithm
as implemented in the MPSRCH program (Oxford Molecular), using an affine gap
search with parameters gap open penalty=12 and gap extension penalty=1. Thus, for
example, the present invention includes an isolate which is 80% identical to a
polypeptide encoded by ΔNS35. In some aspects of the invention, the polypeptide of
the present invention is substantially homologous to the ΔNS35.
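
For orientation, a Smith-Waterman local alignment score with affine gap penalties can be sketched as follows. This is not the MPSRCH implementation: the simple match/mismatch scores stand in for a real substitution matrix and are assumptions, as is the convention that a gap of length k costs gap_open + (k - 1) * gap_extend. Only the gap open penalty of 12 and gap extension penalty of 1 come from the text.

```python
def smith_waterman_affine(a, b, match=5, mismatch=-4,
                          gap_open=12, gap_extend=1):
    """Best local alignment score of a and b with affine gap penalties.

    Uses the Gotoh recurrences with three states per cell:
    H (best score ending at the cell), E (ending with a gap in a)
    and F (ending with a gap in b). match/mismatch are assumed values.
    """
    neg = float("-inf")
    cols = len(b) + 1
    h_prev = [0] * cols      # H values of the previous row
    f_prev = [neg] * cols    # F values of the previous row
    best = 0
    for i in range(1, len(a) + 1):
        h_cur = [0] * cols
        f_cur = [neg] * cols
        e = neg              # E value carried along the current row
        for j in range(1, cols):
            # Open a new gap or extend an existing one.
            e = max(h_cur[j - 1] - gap_open, e - gap_extend)
            f_cur[j] = max(h_prev[j] - gap_open, f_prev[j] - gap_extend)
            s = match if a[i - 1] == b[j - 1] else mismatch
            # Local alignment: scores never drop below zero.
            h_cur[j] = max(0, h_prev[j - 1] + s, e, f_cur[j])
            best = max(best, h_cur[j])
        h_prev, f_prev = h_cur, f_cur
    return best
```

With the default penalties, opening a gap is expensive relative to a single match, so short sequences tend to align without gaps; lowering gap_open in the sketch shows the gapped alignment taking over.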

In general, "identity" refers to an exact nucleotide-to-nucleotide or amino acid-
to-amino acid correspondence of two polynucleotides or polypeptide sequences,
respectively. Percent identity can be determined by a direct comparison of the
sequence information between two molecules by aligning the sequences, counting the
exact number of matches between the two aligned sequences, dividing by the length of
the shorter sequence, and multiplying the result by 100. Readily available computer
programs can be used to aid in the analysis, such as ALIGN, Dayhoff, M.O. in Atlas of
Protein Sequence and Structure M.O. Dayhoff ed., 5 Suppl. 3:353-358, National
Biomedical Research Foundation, Washington, DC, which adapts the local homology
algorithm of Smith and Waterman, Advances in Appl. Math. 2:482-489, 1981 for
peptide analysis. Programs for determining nucleotide sequence identity are available
in the Wisconsin Sequence Analysis Package, Version 8 (available from Genetics
Computer Group, Madison, WI) for example, the BESTFIT, FASTA and GAP
programs, which also rely on the Smith and Waterman algorithm. These programs are
readily utilized with the default parameters recommended by the manufacturer and
described in the Wisconsin Sequence Analysis Package referred to above. For
example, percent identity of a particular nucleotide sequence to a reference sequence
can be determined using the homology algorithm of Smith and Waterman with a
default scoring table and a gap penalty of six nucleotide positions.
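
The direct-comparison calculation described above (matches divided by the length of the shorter sequence, times 100) can be sketched for two sequences that have already been aligned. The function name percent_identity and the use of "-" as the gap character are assumptions made for illustration.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity of two pre-aligned sequences of equal length.

    Matches are counted column by column (gap columns never match), then
    divided by the ungapped length of the shorter sequence and scaled to 100.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    matches = sum(1 for x, y in zip(aligned_a, aligned_b)
                  if x == y and x != gap)
    shorter = min(len(aligned_a.replace(gap, "")),
                  len(aligned_b.replace(gap, "")))
    if shorter == 0:
        raise ValueError("sequences must not be empty")
    return 100.0 * matches / shorter
```

For example, the alignment of ACGT-A against ACCTTA has four matching columns and a shorter sequence of five residues, giving 80% identity.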

Another method of establishing percent identity in the context of the present
invention is to use the MPSRCH package of programs copyrighted by the University of
Edinburgh, developed by John F. Collins and Shane S. Sturrok, and distributed by
IntelliGenetics, Inc. (Mountain View, CA). From this suite of packages the Smith-
Waterman algorithm can be employed where default parameters are used for the
scoring table (for example, gap open penalty of 12, gap extension penalty of one, and a
gap of six). From the data generated the "Match" value reflects "sequence identity."
Other suitable programs for calculating the percent identity or similarity between
sequences are generally known in the art, for example, another alignment program is
BLAST, used with default parameters. For example, BLASTN and BLASTP can be
used using the following default parameters: genetic code = standard; filter = none;
strand = both; cutoff = 60; expect = 10; Matrix = BLOSUM62; Descriptions = 50
sequences; sort by = HIGH SCORE; Databases = non-redundant, GenBank + EMBL +
DDBJ + PDB + GenBank CDS translations + Swiss protein + Spupdate + PIR. Details
of these programs can be found at the following internet address:
http://www.ncbi.nlm.gov/cgi-bin/BLAST.

Alternatively, homology can be determined by hybridization of polynucleotides
under conditions which form stable duplexes between homologous regions, followed
by digestion with single-stranded-specific nuclease(s), and size determination of the
digested fragments. DNA sequences that are substantially homologous can be
identified in a Southern hybridization experiment under, for example, stringent
conditions, as defined for that particular system. Defining appropriate hybridization
conditions is within the skill of the art. See, e.g., Sambrook et al., supra; DNA Cloning,
supra; Nucleic Acid Hybridization, supra.

"Stringency" refers to conditions in a hybridization reaction that favor
association of very similar sequences over sequences that differ. For example, the
combination of temperature and salt concentration should be chosen that is
approximately 12° to 20°C below the calculated Tm of the hybrid under study. The
temperature and salt conditions can often be determined empirically in preliminary
experiments in which samples of genomic DNA immobilized on filters are hybridized
to the sequence of interest and then washed under conditions of different stringencies.
See Sambrook et al. at page 9.50.

Variables to consider when performing, for example, a Southern blot are (1) the
complexity of the DNA being blotted and (2) the homology between the probe and the
sequences being detected. The total amount of the fragment(s) to be studied can vary a
magnitude of 10, from 0.1 to 1 μg for a plasmid or phage digest to 10^-9 to 10^-8 g for a
single copy gene in a highly complex eukaryotic genome. For lower complexity
polynucleotides, substantially shorter blotting, hybridization, and exposure times, a
smaller amount of starting polynucleotides, and lower specific activity of probes can be
used. For example, a single-copy yeast gene can be detected with an exposure time of
only 1 hour starting with 1 μg of yeast DNA, blotting for two hours, and hybridizing
for 4-8 hours with a probe of 10^8 cpm/μg. For a single-copy mammalian gene a
conservative approach would start with 10 μg of DNA, blot overnight, and hybridize
overnight in the presence of 10% dextran sulfate using a probe of greater than 10^8
cpm/μg, resulting in an exposure time of ~24 hours.

Several factors can affect the melting temperature (Tm) of a DNA-DNA hybrid
between the probe and the fragment of interest, and consequently, the appropriate
conditions for hybridization and washing. In many cases the probe is not 100%
homologous to the fragment. Other commonly encountered variables include the
length and total G+C content of the hybridizing sequences and the ionic strength and
formamide content of the hybridization buffer. The effects of all of these factors can be
approximated by a single equation:

$$T_m = 81 + 16.6\,(\log_{10} C_i) + 0.4\,[\%(G + C)] - 0.6\,(\%\,\mathrm{formamide}) - 600/n - 1.5\,(\%\,\mathrm{mismatch})$$

where Ci is the salt concentration (monovalent ions) and n is the length of the hybrid in
base pairs (slightly modified from Meinkoth & Wahl (1984) Anal. Biochem. 138:267-
284). In general, convenient hybridization temperatures in the presence of 50%
formamide are 42°C for a probe which is 95% to 100% homologous to the target
fragment, 37°C for 90% to 95% homology, and 32°C for 85% to 90% homology. For
lower homologies, formamide content should be lowered and temperature adjusted
accordingly, using the equation above. If the homology between the probe and the
target fragment is not known, the simplest approach is to start with both hybridization
and wash conditions which are nonstringent. If non-specific bands or high background
are observed after autoradiography, the filter can be washed at high stringency and
reexposed. If the time required for exposure makes this approach impractical, several
hybridization and/or washing stringencies should be tested in parallel.
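
As a worked illustration of the Tm approximation given in this paragraph, the equation can be evaluated directly. The function and parameter names are hypothetical, and the sketch assumes the salt concentration is expressed in mol/L, which the text does not state explicitly.

```python
import math


def melting_temperature(salt_molar, gc_percent, formamide_percent,
                        length_bp, mismatch_percent=0.0):
    """Approximate Tm (degrees C) of a DNA-DNA hybrid.

    Implements Tm = 81 + 16.6*log10(Ci) + 0.4*(%G+C)
                    - 0.6*(%formamide) - 600/n - 1.5*(%mismatch).
    Assumption: Ci (monovalent ions) is given in mol/L.
    """
    if salt_molar <= 0 or length_bp <= 0:
        raise ValueError("salt_molar and length_bp must be positive")
    return (81.0
            + 16.6 * math.log10(salt_molar)
            + 0.4 * gc_percent
            - 0.6 * formamide_percent
            - 600.0 / length_bp
            - 1.5 * mismatch_percent)
```

For a 600 bp hybrid with 50% G+C in 1 M monovalent salt and no formamide, the terms are 81 + 0 + 20 - 0 - 1 - 0, i.e. about 100°C; adding 50% formamide and 10% mismatch lowers the estimate by 30 and 15 degrees respectively, to about 55°C.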

By "nucleic acid immunization" is meant the introduction of a nucleic acid
molecule encoding one or more selected antigens into a host cell, for the in vivo
expression of the antigen or antigens. The nucleic acid molecule can be introduced
directly into the recipient subject, such as by injection, inhalation, oral, intranasal and
mucosal administration, or the like, or can be introduced ex vivo, into cells which have
been removed from the host. In the latter case, the transformed cells are reintroduced
into the subject where an immune response can be mounted against the antigen encoded
by the nucleic acid molecule.

An "open reading frame" or ORF is a region of a polynucleotide sequence 
which encodes a polypeptide; this region can represent a portion of a coding sequence
or a total coding sequence. 

As used herein, the term "antibody" refers to a polypeptide or group of
polypeptides which comprise at least one antigen binding site. An "antigen binding
site" is formed from the folding of the variable domains of an antibody molecule(s) to
form three-dimensional binding sites with an internal surface shape and charge
distribution complementary to the features of an epitope of an antigen, which allows
specific binding to form an antibody-antigen complex. An antigen binding site may be
formed from a heavy- and/or light-chain domain (VH and VL, respectively), which
form hypervariable loops which contribute to antigen binding. The term "antibody"
includes, without limitation, polyclonal antibodies, monoclonal antibodies, chimeric
antibodies, altered antibodies, univalent antibodies, Fab proteins, and single-domain
antibodies. In many cases, the binding phenomenon of antibodies to antigens is
equivalent to other ligand/anti-ligand binding.

If polyclonal antibodies are desired, a selected mammal (e.g., mouse, rabbit,
goat, horse, etc.) is immunized with an immunogenic polypeptide bearing an HCV
epitope(s). Serum from the immunized animal is collected and treated according to 
known procedures. If serum containing polyclonal antibodies to an HCV epitope 
contains antibodies to other antigens, the polyclonal antibodies can be purified by 
immunoaffinity chromatography. Techniques for producing and processing polyclonal
antisera are known in the art, see for example, Mayer and Walker, eds. (1987)
IMMUNOCHEMICAL METHODS IN CELL AND MOLECULAR BIOLOGY 
(Academic Press, London). 

Monoclonal antibodies directed against HCV epitopes can also be readily
produced by one skilled in the art. The general methodology for making monoclonal
antibodies by hybridomas is well known. Immortal antibody-producing cell lines can
be created by cell fusion, and also by other techniques such as direct transformation of
B lymphocytes with oncogenic DNA, or transfection with Epstein-Barr virus. See, e.g.,
M. Schreier et al. (1980) HYBRIDOMA TECHNIQUES; Hammerling et al. (1981),
MONOCLONAL ANTIBODIES AND T-CELL HYBRIDOMAS; Kennett et al.
(1980) MONOCLONAL ANTIBODIES; see also, U.S. Pat. Nos. 4,341,761; 4,399,121;
4,427,783; 4,444,887; 4,466,917; 4,472,500; 4,491,632; and 4,493,890. Panels of
monoclonal antibodies produced against HCV epitopes can be screened for various
properties; i.e., for isotype, epitope affinity, etc. As used herein, a "single domain
antibody" (dAb) is an antibody which is comprised of a VH domain, which binds
specifically with a designated antigen. A dAb does not contain a VL domain, but may
contain other antigen binding domains known to exist in antibodies, for example, the
kappa and lambda domains. Methods for preparing dAbs are known in the art. See, for
example, Ward et al. Nature 341:544 (1989).

Antibodies can also be comprised of VH and VL domains, as well as other
known antigen binding domains. Examples of these types of antibodies and methods
for their preparation are known in the art (see, e.g., U.S. Pat. No. 4,816,467), and
include the following. For example, "vertebrate antibodies" refers to antibodies which
are tetramers or aggregates thereof, comprising light and heavy chains which are
usually aggregated in a "Y" configuration and which may or may not have covalent
linkages between the chains. In vertebrate antibodies, the amino acid sequences of the
chains are homologous with those sequences found in antibodies produced in
vertebrates, whether in situ or in vitro (for example, in hybridomas). Vertebrate
antibodies include, for example, purified polyclonal antibodies and monoclonal
antibodies, methods for the preparation of which are described infra.

"Hybrid antibodies" are antibodies where chains are separately homologous
with reference to mammalian antibody chains and represent novel assemblies of them,
so that two different antigens are precipitable by the tetramer or aggregate. In hybrid
antibodies, one pair of heavy and light chains are homologous to those found in an
antibody raised against a first antigen, while a second pair of chains are homologous to
those found in an antibody raised against a second antigen. This results in the property
of "divalence", i.e., the ability to bind two antigens simultaneously. Such hybrids can
also be formed using chimeric chains, as set forth below.

"Chimeric antibodies" refers to antibodies in which the heavy and/or light 
chains are fusion proteins. Typically, one portion of the amino acid sequences of the
chain is homologous to corresponding sequences in an antibody derived from a
particular species or a particular class, while the remaining segment of the chain is
homologous to the sequences derived from another species and/or class. Usually, the
variable region of both light and heavy chains mimics the variable regions of antibodies
derived from one species of vertebrates, while the constant portions are homologous to
the sequences in the antibodies derived from another species of vertebrates. However,
the definition is not limited to this particular example. Also included is any antibody in
which either or both of the heavy or light chains are composed of combinations of
sequences mimicking the sequences in antibodies of different sources, whether these
sources be from differing classes or different species of origin, and whether or not the
fusion point is at the variable/constant boundary. Thus, it is possible to produce
antibodies in which neither the constant nor the variable region mimics known antibody
sequences. It then becomes possible, for example, to construct antibodies whose
variable region has a higher specific affinity for a particular antigen, or whose constant
region can elicit enhanced complement fixation, or to make other improvements in
properties possessed by a particular constant region.

Another example is "altered antibodies", which refers to antibodies in which the
naturally occurring amino acid sequence in a vertebrate antibody has been varied.
Utilizing recombinant DNA techniques, antibodies can be redesigned to obtain desired
characteristics. The possible variations are many, and range from the changing of one
or more amino acids to the complete redesign of a region, for example, the constant
region. Changes in the constant region are made, in general, to attain desired cellular process
characteristics, e.g., changes in complement fixation, interaction with membranes, and
other effector functions. Changes in the variable region can be made to alter antigen
binding characteristics. The antibody can also be engineered to aid the specific delivery
of a molecule or substance to a specific cell or tissue site. The desired alterations can be
made by known techniques in molecular biology, e.g., recombinant techniques,
site-directed mutagenesis, etc.

Yet another example is "univalent antibodies", which are aggregates
comprised of a heavy-chain/light-chain dimer bound to the Fc (i.e., stem) region of a
second heavy chain. This type of antibody escapes antigenic modulation. See, e.g.,
Glennie et al. Nature 295:712 (1982). Included also within the definition of antibodies
are "Fab" fragments of antibodies. The "Fab" region refers to those portions of the
heavy and light chains which are roughly equivalent, or analogous, to the sequences
which comprise the branch portion of the heavy and light chains, and which have been
shown to exhibit immunological binding to a specified antigen, but which lack the
effector Fc portion. "Fab" includes aggregates of one heavy and one light chain
(commonly known as Fab'), as well as tetramers containing the 2H and 2L chains
(referred to as F(ab)2), which are capable of selectively reacting with a designated
antigen or antigen family. Fab antibodies can be divided into subsets analogous to those
described above, i.e., "vertebrate Fab", "hybrid Fab", "chimeric Fab", and "altered Fab".
Methods of producing Fab fragments of antibodies are known within the art and
include, for example, proteolysis, and synthesis by recombinant techniques.

"Antigen-antibody complex" refers to the complex formed by an antibody that 
is specifically bound to an epitope on an antigen. 

"Immunogenic polypeptide" refers to a polypeptide that elicits a cellular and/or 
humoral immune response in a mammal, whether alone or linked to a carrier, in the
presence or absence of an adjuvant. 

"Antigenic determinant" refers to the site on an antigen or hapten to which a 
specific antibody molecule or specific cell surface receptor binds. 

As used herein, "treatment" refers to any of (i) the prevention of infection or 
reinfection, as in a traditional vaccine, (ii) the reduction or elimination of symptoms,
and (iii) the substantial or complete elimination of the pathogen in question. Treatment
may be effected prophylactically (prior to infection) or therapeutically (following 
infection). 

By "vertebrate subject" is meant any member of the subphylum chordata,
including, without limitation, humans and other primates, including non-human
primates such as chimpanzees and other apes and monkey species; farm animals such
as cattle, sheep, pigs, goats and horses; domestic mammals such as dogs and cats;
laboratory animals including rodents such as mice, rats and guinea pigs; birds,
including domestic, wild and game birds such as chickens, turkeys and other
gallinaceous birds, ducks, geese, and the like. The term does not denote a particular
age. Thus, both adult and newborn individuals are intended to be covered. The
invention described herein is intended for use in any of the above vertebrate species,
since the immune systems of all of these vertebrates operate similarly.

II. Modes of Carrying Out the Invention

Before describing the present invention in detail, it is to be understood that this
invention is not limited to particular formulations or process parameters as such may, of
course, vary. It is also to be understood that the terminology used herein is for the 
purpose of describing particular embodiments of the invention only, and is not intended 
to be limiting. 

Although a number of compositions and methods similar or equivalent to those
described herein can be used in the practice of the present invention, the preferred
materials and methods are described herein. 

General Overview 

An aim of an HCV vaccine is to generate broad immunity to a wide breadth of
antigens because HCV is so divergent and because humoral as well as cellular immune
responses are desirable to combat this human pathogen. While antibodies generated
against the envelope glycoprotein(s) might aid in virus neutralization, there is
additional benefit to be derived from a vaccine that includes other regions. The
likelihood of T-helper responses generated against a polypeptide would be helpful in a
vaccine setting as would generation of cytotoxic T cells. The non-structural region
represents such a candidate antigen, but processing by the protease generates several
polypeptides, making purification complicated. It would be advantageous, therefore, to
derive a non-structural cassette that is unprocessed by the NS3 protease.

The present invention solves this and other problems using compositions and
methods involving an N-terminal deletion in NS3, which removes the catalytic domain.
As such, some or all of the remainder of the non-structural region (through NS5B) is
expressed as an intact polypeptide. Expression of this species has been documented in
mammalian cells as well as in yeast. Further, in certain aspects, polynucleotides
encoding HCV core polypeptides (or fragments thereof) are added (e.g., operably
linked) to the carboxy-terminus of the non-structural cassette. As the core coding
region is relatively highly conserved among HCV isolates, the presence of this region
may enhance the immune response. Because core has at its C-terminus a very
hydrophobic domain (amino acids 174-191), shorter versions of core were also
engineered onto the polypeptide. As described in detail herein, the truncation of core to
amino acid 121 yielded higher expression than the amino acid 173 truncation when
engineered onto the C-terminus of the mutant NS polypeptide. The combination of
most of the non-structural region fused to a C-terminally truncated core into a
polypeptide is novel and has advantages for vaccine immunization. Moreover, because
the aim is not necessarily to generate antibody responses to this polypeptide, there is no
need to maintain a native conformation, enabling a more facile purification protocol.


Mutant HCV Non-Structural Polypeptides 

Genomes of HCV strains contain a single open reading frame of approximately
9,000 to 12,000 nucleotides, which is translated into a polyprotein. An HCV
polyprotein is cleaved to produce at least ten distinct products, in the order of
NH2-Core-E1-E2-p7-NS2-NS3-NS4a-NS4b-NS5a-NS5b-COOH. Mutant HCV
polypeptides of the invention contain an N-terminal deletion in NS3, which removes or
disables the catalytic domain. Preferably, the polypeptides also include the remainder
of the non-structural region, although in certain embodiments, the polypeptides may
include less than all of the remaining NS polypeptides, for example mutant NS
polypeptides including any combinations of NS2-NS3-NS4a-NS4b-NS5a-NS5b (e.g.,
NS3NS3-NS5a-NS5b; NS3-NS4a-NS4b; NS3-NS4a-NS4b-NS5a; NS3-NS4b-NS5a-NS5b;
NS3-NS4a-NS5a; NS3-NS4b-NS5a; NS3-NS4b-NS5b; etc.).
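
By way of illustration only, the combinations contemplated above can be enumerated
programmatically. The following minimal sketch assumes that the (mutant) NS3 region
is always present and that the selected regions are kept in their native polyprotein
order; both are simplifying assumptions of the sketch, and the names NATIVE_ORDER
and ns3_combinations are hypothetical.

```python
from itertools import combinations

# Native order of the non-structural regions in the polyprotein.
NATIVE_ORDER = ["NS2", "NS3", "NS4a", "NS4b", "NS5a", "NS5b"]


def ns3_combinations():
    """List every region combination that contains NS3, in native order.

    Assumption of this sketch: NS3 (carrying the N-terminal deletion) is
    always included and the other regions are optional.
    """
    others = [region for region in NATIVE_ORDER if region != "NS3"]
    result = []
    for k in range(len(others) + 1):
        for subset in combinations(others, k):
            chosen = set(subset) | {"NS3"}
            # Re-emit the chosen regions in their native order.
            result.append("-".join(r for r in NATIVE_ORDER if r in chosen))
    return result


if __name__ == "__main__":
    for combo in ns3_combinations():
        print(combo)
```

Each of the five optional regions is either present or absent, so the sketch lists
32 combinations, including NS3 alone and the complete NS2-through-NS5b cassette.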

The HCV NS3 protein functions as a protease and a helicase and occurs at
approximately amino acid 1027 to amino acid 1657 of the polyprotein (numbered
relative to HCV-1). See Choo et al. (1991) Proc. Natl. Acad. Sci. USA 88:2451-2455.
HCV NS4 occurs at approximately amino acid 1658 to amino acid 1972, NS5a occurs
at approximately amino acid 1973 to amino acid 2420, and HCV NS5b occurs at
approximately amino acid 2421 to amino acid 3011 of the polyprotein (numbered
relative to HCV-1) (Choo et al., 1991).
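
For convenience, the approximate boundaries recited above can be held in a small
lookup table. The following sketch is illustrative only; the boundaries are the
approximate HCV-1 positions stated in the text, and the names NS_REGIONS, region_of
and region_length are hypothetical.

```python
# Approximate region boundaries (1-based, inclusive), numbered relative to HCV-1,
# exactly as recited in the text above.
NS_REGIONS = {
    "NS3": (1027, 1657),
    "NS4": (1658, 1972),
    "NS5a": (1973, 2420),
    "NS5b": (2421, 3011),
}


def region_of(residue):
    """Return the NS region containing a polyprotein residue, or None."""
    for name, (start, end) in NS_REGIONS.items():
        if start <= residue <= end:
            return name
    return None


def region_length(name):
    """Return the number of residues spanned by the named region."""
    start, end = NS_REGIONS[name]
    return end - start + 1


if __name__ == "__main__":
    for name in NS_REGIONS:
        print(name, NS_REGIONS[name], region_length(name))
```

Because the boundaries are approximate, a lookup near a junction should be read as
indicative rather than exact.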

The mutant polypeptides described herein can either be full-length polypeptides
or portions of NS3, NS4 (NS4a and NS4b), NS5a, and NS5b polypeptides. Epitopes of
NS3, NS4 (NS4a and NS4b), NS5a, NS5b, NS3NS4NS5a, and NS3NS4NS5aNS5b can
be identified by several methods. For example, NS3, NS4, NS5a, NS5b polypeptides
or fusion proteins comprising any combination of the above, can be isolated, for
example, by immunoaffinity purification using a monoclonal antibody for the
polypeptide or protein. The isolated protein sequence can then be screened by
preparing a series of short peptides by proteolytic cleavage of the purified protein,
which together span the entire protein sequence. By starting with, for example,
100-mer polypeptides, each polypeptide can be tested for the presence of epitopes
recognized by a T cell receptor on an HCV-activated T cell. Progressively smaller and
overlapping fragments can then be tested from an identified 100-mer to map the epitope
of interest.
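
The fragment-based mapping described above can be sketched in code. The example
below is a minimal illustration that tiles a sequence with overlapping peptides of
a fixed length; the fragment length, the step, and the function name
overlapping_peptides are assumptions of the sketch and are not prescribed by this
specification.

```python
def overlapping_peptides(sequence, length, step):
    """Yield (start, peptide) pairs that together span the whole sequence.

    start is 1-based. Consecutive peptides overlap by (length - step)
    residues; a final peptide is anchored at the C-terminus so that no
    residue is left uncovered.
    """
    if length <= 0 or step <= 0 or step > length:
        raise ValueError("require 0 < step <= length")
    if len(sequence) <= length:
        yield 1, sequence
        return
    last_start = len(sequence) - length
    pos = 0
    while pos < last_start:
        yield pos + 1, sequence[pos:pos + length]
        pos += step
    yield last_start + 1, sequence[last_start:]


if __name__ == "__main__":
    # Arbitrary placeholder sequence, not an HCV sequence.
    for start, peptide in overlapping_peptides("ABCDEFGHIJ", 4, 2):
        print(start, peptide)
```

Applying the same function again to a reactive fragment, with a smaller length and
step, corresponds to the progressive narrowing described above.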

Epitopes recognized by a T cell receptor on an HCV-activated T cell can be
identified by, for example, 51Cr release assay (see Example 2) or by
lymphoproliferation assay (see Example 4). In a 51Cr release assay, target cells can be
constructed that display the epitope of interest by cloning a polynucleotide encoding the
epitope into an expression vector and transforming the expression vector into the target
cells. Non-structural polypeptides can occur in any order in the fusion protein. If
desired, at least 2, 3, 4, 5, 6, 7, 8, 9, or 10 or more of one or more of the polypeptides
may occur in the fusion protein. Multiple viral strains of HCV occur, and NS3, NS4,
NS5a, and NS5b polypeptides of any of these strains can be used in a fusion protein.

Nucleic acid and amino acid sequences of a number of HCV strains and
isolates, including nucleic acid and amino acid sequences of NS3, NS4, NS5a, NS5b
genes and polypeptides have been determined. For example, isolate HCV J1.1 is
described in Kubo et al. (1989) Japan. Nucl. Acids Res. 17:10367-10372; Takeuchi et
al. (1990) Gene 91:287-291; Takeuchi et al. (1990) J. Gen. Virol. 71:3027-3033; and
Takeuchi et al. (1990) Nucl. Acids Res. 18:4626. The complete coding sequences of
two independent isolates, HCV-J and BK, are described by Kato et al. (1990) Proc.
Natl. Acad. Sci. USA 87:9524-9528 and Takamizawa et al. (1991) J. Virol.
65:1105-1113, respectively.

Publications that describe HCV-1 isolates include Choo et al. (1990) Brit. Med.
Bull. 46:423-441; Choo et al. (1991) Proc. Natl. Acad. Sci. USA 88:2451-2455 and
Han et al. (1991) Proc. Natl. Acad. Sci. USA 88:1711-1715. HCV isolates HC-J1 and
HC-J4 are described in Okamoto et al. (1991) Japan J. Exp. Med. 60:167-177. HCV
isolates HCT 18, HCT 23, Th, HCT 27, EC1 and EC10 are described in Weiner et al.
(1991) Virol. 180:842-848. HCV isolates Pt-1, HCV-K1 and HCV-K2 are described in
Enomoto et al. (1990) Biochem. Biophys. Res. Commun. 170:1021-1025. HCV
isolates A, C, D & E are described in Tsukiyama-Kohara et al. (1991) Virus Genes
5:243-254.

Each of the mutant HCV polypeptides containing at least portions of NS3, NS4
and NS5 can be obtained from the same HCV strain or isolate or from different HCV
strains or isolates. Thus, each non-structural region of the polypeptide can be from the
same HCV strain or isolate or from different HCV strains or isolates. In addition
to the mutant HCV non-structural polypeptides described herein, the proteins can
contain other polypeptides derived from the HCV polyprotein. For example, it may be
desirable to include polypeptides derived from the core region of the HCV polyprotein.
This region occurs at amino acid positions 1-191 of the HCV polyprotein, numbered
relative to HCV-1. Either the full-length protein or epitopes of the full-length protein
may be used in the subject fusions, such as those epitopes found between amino acids
10-53, amino acids 10-45, amino acids 67-88, amino acids 120-130, or any of the core
epitopes identified in, e.g., Houghton et al., U.S. Patent No. 5,350,671; Chien et al.,
Proc. Natl. Acad. Sci. USA (1992) 89:10011-10015; Chien et al., J. Gastroent. Hepatol.
(1993) 8:S33-39; Chien et al., International Publication No. WO 93/00365; Chien,
D.Y., International Publication No. WO 94/01778; and commonly owned U.S. Patent
No. 6,150,087. When present, additional HCV polypeptides such as core
can be obtained from the same HCV strain or isolate or from different HCV strains or
isolates.
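
As an illustrative check only, the core epitope ranges listed above can be compared
against a C-terminal truncation point, such as the truncations at amino acid 121
and amino acid 173 mentioned elsewhere in this specification. The sketch below
treats an epitope range as retained only when it lies entirely within the truncated
core; that criterion, and the names CORE_EPITOPE_RANGES and epitopes_retained, are
assumptions of the sketch.

```python
# Core epitope ranges recited in the text (1-based, inclusive, HCV-1 numbering).
CORE_EPITOPE_RANGES = [(10, 53), (10, 45), (67, 88), (120, 130)]


def epitopes_retained(truncation_end):
    """Return the epitope ranges lying wholly within core residues 1..truncation_end.

    Assumption: a range counts as retained only if its last residue is
    present. A partially retained range may still carry an epitope, which
    this simple test does not capture.
    """
    return [(start, end) for start, end in CORE_EPITOPE_RANGES
            if end <= truncation_end]


if __name__ == "__main__":
    for end in (121, 173, 191):
        print(end, epitopes_retained(end))
```

Under this criterion, a truncation at amino acid 121 keeps the first three ranges
in full and cuts into the 120-130 range, whereas a truncation at amino acid 173
keeps all four.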

Preferably, the above-described mutant proteins, as well as the individual
components of these proteins, are produced recombinantly. A polynucleotide encoding
these proteins can be introduced into an expression vector which can be expressed in a
suitable expression system. A variety of bacterial, yeast, mammalian, insect and plant
expression systems are available in the art and any such expression system can be used.
Optionally, a polynucleotide encoding these proteins can be translated in a cell-free
translation system. Such methods are well known in the art. The proteins also can be
constructed by solid phase protein synthesis.

If desired, the mutant polypeptides, or the individual components of these
polypeptides, also can contain other amino acid sequences, such as amino acid linkers
or signal sequences, as well as ligands useful in protein purification, such as
glutathione-S-transferase and staphylococcal protein A.


Polynucleotides 

The polynucleotides of the present invention are not necessarily physically
derived from the nucleotide sequences shown, but can be generated in any manner,
including, for example, chemical synthesis or DNA replication or reverse transcription
or transcription. In addition, combinations of regions corresponding to that of the
designated sequences can be modified in ways known to the art to be consistent with an
intended use.

The DNA encoding the desired polypeptide, whether in fused or mature form,
and whether or not containing a signal sequence to permit secretion, can be ligated into
expression vectors suitable for any convenient host. Both eukaryotic and prokaryotic
host systems are presently used in forming recombinant polypeptides, and a summary
of some of the more common control systems and host cell lines is given below. The
polypeptide produced in such host cells is then isolated from lysed cells or from the
culture medium and purified to the extent needed for its intended use.

Purification can be by techniques known in the art, for example, differential
extraction, salt fractionation, chromatography on ion exchange resins, affinity
chromatography, centrifugation, alkali resolubilization of insoluble protein, and the
like. See, for example, Methods in Enzymology for a variety of methods for purifying
proteins.

Polynucleotides contain less than an entire HCV genome and can be RNA or
single- or double-stranded DNA. Preferably, the polynucleotides are isolated free of
other components, such as proteins and lipids. Polynucleotides of the invention can
also comprise other nucleotide sequences, such as sequences coding for linkers, signal
sequences, or ligands useful in protein purification such as glutathione-S-transferase
and staphylococcal protein A.

Polynucleotides encoding mutant HCV non-structural polypeptides can be
isolated from a genomic library derived from nucleic acid sequences present in, for
example, the plasma, serum, or liver homogenate of an HCV infected individual or can
be synthesized in the laboratory, for example, using an automatic synthesizer. An
amplification method such as PCR can be used to amplify polynucleotides from either
HCV genomic DNA or cDNA.

Further, while the polypeptides that are not NS3, NS4, or NS5 of HCV of the
present invention can comprise a substantially complete viral domain, in many
applications all that is required is that the polypeptide comprise an antigenic or
immunogenic region of the virus. An antigenic region of a polypeptide is generally
relatively small, typically 8 to 10 amino acids or less in length. Fragments of as few as 5
amino acids can characterize an antigenic region. These segments can correspond to
regions of, for example, C, E1, or E2 epitopes. Accordingly, using the cDNAs of C, E1,
or E2 as a basis, DNAs encoding short segments of C, E1, or E2 polypeptides can be
expressed recombinantly either as fusion proteins, or as isolated polypeptides. In
addition, short amino acid sequences can be conveniently obtained by chemical
synthesis.

Polynucleotides encoding the polypeptides described herein can comprise
coding sequences for these polypeptides which occur naturally or can be artificial
sequences which do not occur in nature. These polynucleotides can be ligated to form a
coding sequence for the fusion proteins using standard molecular biology techniques.
If desired, polynucleotides can be cloned into an expression vector and transformed
into, for example, bacterial, yeast, insect, plant or mammalian cells so that the fusion
proteins of the invention can be expressed in and isolated from a cell culture.
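
To illustrate how an artificial coding sequence can encode the same polypeptide as
a naturally occurring one, the following minimal sketch back-translates a peptide
using one codon per amino acid from the standard genetic code and then translates
the result to confirm that the peptide is recovered. The choice of the first codon
listed for each amino acid is arbitrary and is an assumption of the sketch (no
host-specific codon preference is implied), the peptide shown is an arbitrary
placeholder rather than an HCV sequence, and the function names are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated with bases in the order T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# One arbitrarily chosen codon per amino acid (the first one in the table).
CHOSEN_CODON = {}
for codon, aa in CODON_TABLE.items():
    CHOSEN_CODON.setdefault(aa, codon)


def back_translate(protein):
    """Return a DNA coding sequence for the protein, ending in a stop codon."""
    return "".join(CHOSEN_CODON[aa] for aa in protein) + CHOSEN_CODON["*"]


def translate(dna):
    """Translate DNA in frame from its first base until a stop codon."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


if __name__ == "__main__":
    peptide = "MKTAYIAK"  # arbitrary placeholder peptide
    dna = back_translate(peptide)
    print(dna, translate(dna) == peptide)
```

Because most amino acids are encoded by more than one codon, many different
artificial sequences encode the same polypeptide; the sketch produces only one.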

The expression of polypeptides containing these domains in a variety of
recombinant host cells, including, for example, bacteria, yeast, insect, plant and
vertebrate cells, gives rise to important immunological reagents which can be used for
diagnosis, detection, and vaccines.

The general techniques used in extracting the genome from a virus, preparing
and probing a cDNA library, sequencing clones, constructing expression vectors,
transforming cells, performing immunological assays such as radioimmunoassays and
ELISA assays, growing cells in culture, and the like are known in the art, and
laboratory manuals are available describing these techniques. However, as a general
guide, the following sets forth some sources currently available for such procedures
and for materials useful in carrying them out.

Both prokaryotic and eukaryotic host cells may be used for expression of
desired coding sequences when appropriate control sequences which are compatible
with the designated host are used. Among prokaryotic hosts, E. coli is most frequently
used. Expression control sequences for prokaryotes include promoters, optionally
containing operator portions, and ribosome binding sites. Transfer vectors compatible
with prokaryotic hosts are commonly derived from, for example, pBR322, a plasmid
containing operons conferring ampicillin and tetracycline resistance, and the various
pUC vectors, which also contain sequences conferring antibiotic resistance markers.
These markers may be used to obtain successful transformants by selection. Commonly
used prokaryotic control sequences include the Beta-lactamase (penicillinase) and
lactose promoter systems (Chang et al. (1977) Nature 198:1056), the tryptophan (trp)
promoter system (Goeddel et al. (1980) Nucleic Acids Res. 8:4057), the lambda-derived
P[L] promoter and N gene ribosome binding site (Shimatake et al. (1981) Nature
292:128) and the hybrid tac promoter (De Boer et al. (1983) Proc. Natl. Acad. Sci.
U.S.A. 292:128) derived from sequences of the trp and lac UV5 promoters. The
foregoing systems are particularly compatible with E. coli; if desired, other prokaryotic
hosts such as strains of Bacillus or Pseudomonas may be used, with corresponding
control sequences.

Eukaryotic hosts include mammalian and yeast cells in culture systems.
Mammalian cell lines available as hosts for expression are known in the art and include
many immortalized cell lines available from the American Type Culture Collection
(ATCC), including HeLa cells, Chinese hamster ovary (CHO) cells, baby hamster
kidney (BHK) cells, and a number of other cell lines. Suitable promoters for
mammalian cells are also known in the art and include viral promoters such as that
from Simian Virus 40 (SV40) (Fiers (1978) Nature 273:113), Rous sarcoma virus
(RSV), adenovirus (ADV), and bovine papilloma virus (BPV). Mammalian cells may
also require terminator sequences and poly A addition sequences; enhancer sequences
which increase expression may also be included, and sequences which cause
amplification of the gene may also be desirable. These sequences are known in the art.
Vectors suitable for replication in mammalian cells may include viral replicons, or
sequences which ensure integration of the appropriate sequences encoding NANBV
epitopes into the host genome.

The vaccinia virus system can also be used to express foreign DNA in
mammalian cells. To express heterologous genes, the foreign DNA is usually inserted
into the thymidine kinase gene of the vaccinia virus and then infected cells can be
selected. This procedure is known in the art and further information can be found in
these references (Mackett et al. J. Virol. 49:857-864 (1984) and Chapter 7 in DNA
Cloning, Vol. 2, IRL Press).

Yeast expression systems are also known to one of ordinary skill in the art. A
yeast promoter is any DNA sequence capable of binding yeast RNA polymerase and
initiating the downstream (3') transcription of a coding sequence (e.g., structural gene)
into mRNA. A promoter will have a transcription initiation region which is usually
placed proximal to the 5' end of the coding sequence. This transcription initiation
region usually includes an RNA polymerase binding site (the "TATA Box") and a
transcription initiation site. A yeast promoter may also have a second domain called an
upstream activator sequence (UAS), which, if present, is usually distal to the structural
gene. The UAS permits regulated (inducible) expression. Constitutive expression
occurs in the absence of a UAS. Regulated expression may be either positive or
negative, thereby either enhancing or reducing transcription.

Yeast is a fermenting organism with an active metabolic pathway; therefore,
sequences encoding enzymes in the metabolic pathway provide particularly useful
promoter sequences. Examples include alcohol dehydrogenase (ADH) (EP-A-0 284
044), enolase, glucokinase, glucose-6-phosphate isomerase,
glyceraldehyde-3-phosphate-dehydrogenase (GAP or GAPDH), hexokinase,
phosphofructokinase, 3-phosphoglycerate mutase, and pyruvate kinase (PyK)
(EPO-A-0 329 203). The yeast PHO5 gene, encoding acid phosphatase, also provides
useful promoter sequences (Myanohara et al. (1983) Proc. Natl. Acad. Sci. USA 80:1).

In addition, synthetic promoters which do not occur in nature also function as
yeast promoters. For example, UAS sequences of one yeast promoter may be joined
with the transcription activation region of another yeast promoter, creating a synthetic
hybrid promoter. Examples of such hybrid promoters include the ADH regulatory
sequence linked to the GAP transcription activation region (US Patent Nos. 4,876,197
and 4,880,734). Other examples of hybrid promoters include promoters which consist
of the regulatory sequences of either the ADH2, GAL4, GAL10, or PHO5 genes,
combined with the transcriptional activation region of a glycolytic enzyme gene such as
GAP or PyK (EP-A-0 164 556). Furthermore, a yeast promoter can include naturally
occurring promoters of non-yeast origin that have the ability to bind yeast RNA
polymerase and initiate transcription. Examples of such promoters include, inter alia,
(Cohen et al. (1980) Proc. Natl. Acad. Sci. 77:1078; Henikoff et al. (1981)
Nature 283:835; Hollenberg et al. (1981) Curr. Topics Microbiol. Immunol. 96:119;
Hollenberg et al. (1979) "The Expression of Bacterial Antibiotic Resistance Genes in
the Yeast Saccharomyces cerevisiae," in: Plasmids of Medical, Environmental and
Commercial Importance (eds. K.N. Timmis and A. Puhler); Mercerau-Puigalon et al.
(1980) Gene 11:163; Panthier et al. (1980) Curr. Genet. 2:109).

A DNA molecule may be expressed intracellularly in yeast. A promoter
sequence may be directly linked with the DNA molecule, in which case the first amino
acid at the N-terminus of the recombinant protein will always be a methionine, which is
encoded by the ATG start codon. If desired, methionine at the N-terminus may be
cleaved from the protein by in vitro incubation with cyanogen bromide.

Fusion proteins provide an alternative for yeast expression systems, as well as
in mammalian, baculovirus, and bacterial expression systems. Usually, a DNA
sequence encoding the N-terminal portion of an endogenous yeast protein, or other
stable protein, is fused to the 5' end of heterologous coding sequences. Upon
expression, this construct will provide a fusion of the two amino acid sequences. For
example, the yeast or human superoxide dismutase (SOD) gene can be linked at the 5'
terminus of a foreign gene and expressed in yeast. The DNA sequence at the junction
of the two amino acid sequences may or may not encode a cleavable site. See e.g.,
EP-A-0 196 056. Another example is a ubiquitin fusion protein. Such a fusion protein is
made with the ubiquitin region that preferably retains a site for a processing enzyme
(e.g., ubiquitin-specific processing protease) to cleave the ubiquitin from the foreign
protein. Through this method, therefore, native foreign protein can be isolated (e.g.,
WO88/024066).

Alternatively, foreign proteins can also be secreted from the cell into the growth
media by creating chimeric DNA molecules that encode a fusion protein comprised of a
leader sequence fragment that provides for secretion in yeast of the foreign protein.
Preferably, there are processing sites encoded between the leader fragment and the
foreign gene that can be cleaved either in vivo or in vitro. The leader sequence
fragment usually encodes a signal peptide comprised of hydrophobic amino acids
which direct the secretion of the protein from the cell.

DNA encoding suitable signal sequences can be derived from genes for secreted
yeast proteins, such as the yeast invertase gene (EP-A-0 012 873; JPO 62,096,086)
and the A-factor gene (US patent 4,588,684). Alternatively, leaders of non-yeast
origin, such as an interferon leader, exist that also provide for secretion in yeast
(EP-A-0 060 057).

A preferred class of secretion leaders are those that employ a fragment of the
yeast alpha-factor gene, which contains both a "pre" signal sequence, and a "pro"
region. The types of alpha-factor fragments that can be employed include the
full-length pre-pro alpha factor leader (about 83 amino acid residues) as well as truncated
alpha-factor leaders (usually about 25 to about 50 amino acid residues) (US Patents
4,546,083 and 4,870,008; EP-A-0 324 274). Additional leaders employing an
alpha-factor leader fragment that provides for secretion include hybrid alpha-factor leaders
made with a presequence of a first yeast, but a pro-region from a second yeast
alpha-factor (e.g., see WO 89/02463).

Usually, transcription termination sequences recognized by yeast are regulatory
regions located 3' to the translation stop codon, and thus together with the promoter
flank the coding sequence. These sequences direct the transcription of an mRNA
which can be translated into the polypeptide encoded by the DNA. Examples include
transcription terminator sequences and other yeast-recognized termination sequences,
such as those of genes coding for glycolytic enzymes.

Usually, the above described components, comprising a promoter, leader (if
desired), coding sequence of interest, and transcription termination sequence, are put
together into expression constructs. Expression constructs are often maintained in a
replicon, such as an extrachromosomal element (e.g., plasmids) capable of stable
maintenance in a host, such as yeast or bacteria. The replicon may have two replication
systems, thus allowing it to be maintained, for example, in yeast for expression and in a
prokaryotic host for cloning and amplification. Examples of such yeast-bacteria shuttle
vectors include YEp24 (Botstein et al. (1979) Gene 8:17-24), pC1/1 (Brake et al.
(1984) Proc. Natl. Acad. Sci. USA 81:4642-4646), and YRp17 (Stinchcomb et al.
(1982) J. Mol. Biol. 158:157). In addition, a replicon may be either a high or low
copy number plasmid. A high copy number plasmid will generally have a copy number
ranging from about 5 to about 200, and usually about 10 to about 150. A host
containing a high copy number plasmid will preferably have at least about 10, and more
preferably at least about 20. Either a high or low copy number vector may be selected,
depending upon the effect of the vector and the foreign protein on the host. See e.g.,
Brake et al., supra.
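
The arrangement of components just described can be represented as a simple data
structure. The sketch below is illustrative only: the class ExpressionConstruct, the
function is_high_copy, and the component labels are hypothetical, and the copy
number test simply applies the "about 5 to about 200" range stated above as if it
were a sharp boundary.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ExpressionConstruct:
    """Components of an expression construct, listed 5' to 3'."""
    promoter: str
    coding_sequence: str
    terminator: str
    leader: Optional[str] = None  # optional secretion leader

    def ordered_parts(self) -> List[str]:
        """Return the components in 5'-to-3' order, omitting an absent leader."""
        parts = [self.promoter]
        if self.leader is not None:
            parts.append(self.leader)
        parts.extend([self.coding_sequence, self.terminator])
        return parts


def is_high_copy(copy_number: int) -> bool:
    """Apply the approximate high copy number range given in the text (5 to 200)."""
    return 5 <= copy_number <= 200


if __name__ == "__main__":
    # Labels are placeholders, not specific sequences.
    construct = ExpressionConstruct(
        promoter="promoter",
        leader="leader",
        coding_sequence="coding sequence of interest",
        terminator="transcription termination sequence",
    )
    print(" -> ".join(construct.ordered_parts()))
    print(is_high_copy(20))
```

Keeping the leader optional mirrors the "leader (if desired)" language above, since
intracellular expression does not require one.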

Alternatively, the expression constructs can be integrated into the yeast genome
with an integrating vector. Integrating vectors usually contain at least one sequence
homologous to a yeast chromosome that allows the vector to integrate, and preferably
contain two homologous sequences flanking the expression construct. Integrations
appear to result from recombinations between homologous DNA in the vector and the
yeast chromosome (Orr-Weaver et al. (1983) Methods in Enzymol. 101:228-245). An
integrating vector may be directed to a specific locus in yeast by selecting the
appropriate homologous sequence for inclusion in the vector. See Orr-Weaver et al.,
supra. One or more expression constructs may integrate, possibly affecting levels of
recombinant protein produced (Rine et al. (1983) Proc. Natl. Acad. Sci. USA
80:6750). The chromosomal sequences included in the vector can occur either as a
single segment in the vector, which results in the integration of the entire vector, or two
segments homologous to adjacent segments in the chromosome and flanking the
expression construct in the vector, which can result in the stable integration of only the
expression construct.

Usually, extrachromosomal and integrating expression constructs may contain
selectable markers to allow for the selection of yeast strains that have been transformed.
Selectable markers may include biosynthetic genes that can be expressed in the yeast
host, such as ADE2, HIS4, LEU2, TRP1, and ALG7, and the G418 resistance gene,
which confer resistance in yeast cells to tunicamycin and G418, respectively. In
addition, a suitable selectable marker may also provide yeast with the ability to grow in
the presence of toxic compounds, such as metal. For example, the presence of CUP1
allows yeast to grow in the presence of copper ions (Butt et al. (1987) Microbiol. Rev.
51:351).

Alternatively, some of the above described components can be put together into
transformation vectors. Transformation vectors are usually comprised of a selectable
marker that is either maintained in a replicon or developed into an integrating vector, as
described above.

Expression and transformation vectors, either extrachromosomal replicons or
integrating vectors, have been developed for transformation into many yeasts. For
example, expression vectors have been developed for, inter alia, the following yeasts:
Candida albicans (Kurtz et al. (1986) Mol. Cell. Biol. 6:142), Candida maltosa
(Kunze et al. (1985) J. Basic Microbiol. 25:141), Hansenula polymorpha (Gleeson
et al. (1986) J. Gen. Microbiol. 132:3459; Roggenkamp et al. (1986) Mol. Gen.
Genet. 202:302), Kluyveromyces fragilis (Das et al. (1984) J. Bacteriol. 158:1165),
Kluyveromyces lactis (De Louvencourt et al. (1983) J. Bacteriol. 154:737; Van den
Berg et al. (1990) Bio/Technology 8:135), Pichia guillerimondii (Kunze et al. (1985)
J. Basic Microbiol. 25:141), Pichia pastoris (Cregg et al. (1985) Mol. Cell. Biol.
5:3376; US Patent Nos. 4,837,148 and 4,929,555), Saccharomyces cerevisiae (Hinnen
et al. (1978) Proc. Natl. Acad. Sci. USA 75:1929; Ito et al. (1983) J. Bacteriol.
153:163), Schizosaccharomyces pombe (Beach and Nurse (1981) Nature 300:706), and
Yarrowia lipolytica (Davidow et al. (1985) Curr. Genet. 10:39; Gaillardin et al.
(1985) Curr. Genet. 10:49).

Methods of introducing exogenous DNA into yeast hosts are well-known in the
art, and usually include either the transformation of spheroplasts or of intact yeast cells
treated with alkali cations. Transformation procedures usually vary with the yeast
species to be transformed. (See e.g., Kurtz et al. (1986) Mol. Cell. Biol. 6:142;
Kunze et al. (1985) J. Basic Microbiol. 25:141; Candida; Gleeson et al. (1986) J.
Gen. Microbiol. 132:3459; Roggenkamp et al. (1986) Mol. Gen. Genet. 202:302;
Hansenula; Das et al. (1984) J. Bacteriol. 158:1165; De Louvencourt et al. (1983) J.
Bacteriol. 154:1165; Van den Berg et al. (1990) Bio/Technology 8:135;
Kluyveromyces; Cregg et al. (1985) Mol. Cell. Biol. 5:3376; Kunze et al. (1985) J.
Basic Microbiol. 25:141; US Patent Nos. 4,837,148 and 4,929,555; Pichia; Hinnen et
al. (1978) Proc. Natl. Acad. Sci. USA 75:1929; Ito et al. (1983) J. Bacteriol.
153:163; Saccharomyces; Beach and Nurse (1981) Nature 300:706;
Schizosaccharomyces; Davidow et al. (1985) Curr. Genet. 10:39; Gaillardin et al.
(1985) Curr. Genet. 10:49; Yarrowia).

Bacterial expression techniques are known in the art. A bacterial promoter is
any DNA sequence capable of binding bacterial RNA polymerase and initiating the
downstream (3') transcription of a coding sequence (e.g., structural gene) into mRNA.
A promoter will have a transcription initiation region which is usually placed proximal
to the 5' end of the coding sequence. This transcription initiation region usually
includes an RNA polymerase binding site and a transcription initiation site. A bacterial
promoter may also have a second domain called an operator, that may overlap an
adjacent RNA polymerase binding site at which RNA synthesis begins. The operator
permits negative regulated (inducible) transcription, as a gene repressor protein may
bind the operator and thereby inhibit transcription of a specific gene. Constitutive
expression may occur in the absence of negative regulatory elements, such as the
operator. In addition, positive regulation may be achieved by a gene activator protein
binding sequence, which, if present, is usually proximal (5') to the RNA polymerase
binding sequence. An example of a gene activator protein is the catabolite activator
protein (CAP), which helps initiate transcription of the lac operon in Escherichia coli
(E. coli) (Raibaud et al. (1984) Annu. Rev. Genet. 18:173). Regulated expression
may therefore be either positive or negative, thereby either enhancing or reducing
transcription.

Expression and transformation vectors, either extra-chromosomal replicons or
integrating vectors, have been developed for transformation into many bacteria. For
example, expression vectors have been developed for, inter alia, the following bacteria:
Bacillus subtilis (Palva et al. (1982) Proc. Natl. Acad. Sci. USA 79:5582; EP-A-0
036 259 and EP-A-0 063 953; WO 84/04541), Escherichia coli (Shimatake et al.
(1981) Nature 292:128; Amann et al. (1985) Gene 40:183; Studier et al. (1986) J.
Mol. Biol. 189:113; EP-A-0 036 776, EP-A-0 136 829 and EP-A-0 136 907),
Streptococcus cremoris (Powell et al. (1988) Appl. Environ. Microbiol. 54:655);
Streptococcus lividans (Powell et al. (1988) Appl. Environ. Microbiol. 54:655),
Streptomyces lividans (US patent 4,745,056).

Methods of introducing exogenous DNA into bacterial hosts are well-known in
the art, and usually include either the transformation of bacteria treated with CaCl2 or
other agents, such as divalent cations and DMSO. DNA can also be introduced into
bacterial cells by electroporation. Transformation procedures usually vary with the
bacterial species to be transformed. (See e.g., Masson et al. (1989) FEMS Microbiol.
Lett. 60:273; Palva et al. (1982) Proc. Natl. Acad. Sci. 79:5582; EP-A-0 036
259 and EP-A-0 063 953; WO 84/04541, Bacillus; Miller et al. (1988) Proc. Natl.
Acad. Sci. 85:856; Wang et al. (1990) J. Bacteriol. 172:949, Campylobacter; Cohen
et al. (1973) Proc. Natl. Acad. Sci. 69:2110; Dower et al. (1988) Nucleic Acids Res.
16:6127; Kushner (1978) "An improved method for transformation of Escherichia coli
with ColE1-derived plasmids," in: Genetic Engineering: Proceedings of the
International Symposium on Genetic Engineering (eds. H.W. Boyer and S. Nicosia);
Mandel et al. (1970) J. Mol. Biol. 53:159; Taketo (1988) Biochim. Biophys. Acta
949:318, Escherichia; Chassy et al. (1987) FEMS Microbiol. Lett. 44:173,
Lactobacillus; Fiedler et al. (1988) Anal. Biochem. 170:38, Pseudomonas; Augustin et
al. (1990) FEMS Microbiol. Lett. 66:203, Staphylococcus; Barany et al. (1980) J.
Bacteriol. 144:698; Harlander (1987) "Transformation of Streptococcus lactis by
electroporation," in: Streptococcal Genetics (ed. J. Ferretti and R. Curtiss III); Perry
et al. (1981) Infect. Immun. 32:1295; Powell et al. (1988) Appl. Environ. Microbiol.
54:655; Somkuti et al. (1987) Proc. 4th Evr. Cong. Biotechnology 1:412,
Streptococcus).

In addition, viral antigens can be expressed in insect cells by the Baculovirus
system. A general guide to Baculovirus expression by Summers and Smith is A Manual
of Methods for Baculovirus Vectors and Insect Cell Culture Procedures (Texas
Agricultural Experiment Station Bulletin No. 1555). To incorporate the heterologous
gene into the Baculovirus genome, the gene is first cloned into a transfer vector
containing some Baculovirus sequences. This transfer vector, when it is cotransfected
with wild-type virus into insect cells, will recombine with the wild-type virus. Usually,
the transfer vector will be engineered so that the heterologous gene will disrupt the
wild-type Baculovirus polyhedrin gene. This disruption enables easy selection of the
recombinant virus since the cells infected with the recombinant virus will appear
phenotypically different from the cells infected with the wild-type virus. The purified
recombinant virus can be used to infect cells to express the heterologous gene. The
foreign protein can be secreted into the medium if a signal peptide is linked in frame to
the heterologous gene; otherwise, the protein will be bound in the cell lysates. For
further information, see Smith et al., Mol. & Cell. Biol. 3:2156-2165 (1983) or Luckow
and Summers in Virology 17:31-39 (1989).

Baculovirus expression can also be effected in plant cells. There are many plant
cell culture and whole plant genetic expression systems known in the art. Exemplary
plant cellular genetic expression systems include those described in patents, such as:
US 5,693,506; US 5,659,122; and US 5,608,143. Additional examples of genetic
expression in plant cell culture have been described by Zenk, Phytochemistry 30:3861-
3863 (1991). Descriptions of plant protein signal peptides may be found, in addition to
the references described above, in Vaulcombe et al., Mol. Gen. Genet. 209:33-40
(1987); Chandler et al., Plant Molecular Biology 3:407-418 (1984); Rogers, J. Biol.
Chem. 260:3731-3738 (1985); Rothstein et al., Gene 55:353-356 (1987); Whittier et
al., Nucleic Acids Research 15:2515-2535 (1987); Wirsel et al., Molecular
Microbiology 3:3-14 (1989); Yu et al., Gene 122:247-253 (1992). A description of the
regulation of plant gene expression by the phytohormone gibberellic acid, and of secreted
enzymes induced by gibberellic acid, can be found in R.L. Jones and J. MacMillin,
Gibberellins, in: Advanced Plant Physiology, Malcolm B. Wilkins, ed., 1984 Pitman
Publishing Limited, London, pp. 21-52. References that describe other metabolically-
regulated genes: Sheen, Plant Cell 2:1027-1038 (1990); Maas et al., EMBO J. 9:3447-
3452 (1990); Benkel and Hickey, Proc. Natl. Acad. Sci. 84:1337-1339 (1987).

All plants from which protoplasts can be isolated and cultured to give whole
regenerated plants can be transformed by the present invention so that whole plants are
recovered which contain the transferred gene. It is known that practically all plants can
be regenerated from cultured cells or tissues, including but not limited to all major
species of sugarcane, sugar beet, cotton, fruit and other trees, legumes and vegetables.
Some suitable plants include, for example, species from the genera Fragaria, Lotus,
Medicago, Onobrychis, Trifolium, Trigonella, Vigna, Citrus, Linum, Geranium,
Manihot, Daucus, Arabidopsis, Brassica, Raphanus, Sinapis, Atropa, Capsicum,
Datura, Hyoscyamus, Lycopersicon, Nicotiana, Solanum, Petunia, Digitalis, Majorana,
Cichorium, Helianthus, Lactuca, Bromus, Asparagus, Antirrhinum, Hemerocallis,
Nemesia, Pelargonium, Panicum, Pennisetum, Ranunculus, Senecio, Salpiglossis,
Cucumis, Browaalia, Glycine, Lolium, Zea, Triticum, Sorghum, and Datura.

Transformation can be by any method for introducing polynucleotides into a
host cell, including, for example, packaging the polynucleotide in a virus and
transducing a host cell with the virus, and by direct uptake of the polynucleotide. The
transformation procedure used depends upon the host to be transformed. Bacterial
transformation by direct uptake generally employs treatment with calcium or rubidium
chloride (Cohen (1972), Proc. Natl. Acad. Sci. U.S.A. 69:2110; Maniatis et al. (1982),
MOLECULAR CLONING; A LABORATORY MANUAL (Cold Spring Harbor Press,
Cold Spring Harbor, N.Y.)). Yeast transformation by direct uptake may be carried out
using the method of Hinnen et al. (1978) Proc. Natl. Acad. Sci. U.S.A. 75:1929.
Mammalian transformations by direct uptake may be conducted using the calcium
phosphate precipitation method of Graham and Van der Eb (1978), Virology 52:546 or
the various known modifications thereof.

Vector construction employs techniques which are known in the art. Site-
specific DNA cleavage is performed by treating with suitable restriction enzymes under
conditions which generally are specified by the manufacturer of these commercially
available enzymes. The cleaved fragments may be separated using polyacrylamide or
agarose gel electrophoresis techniques, according to the general procedures found in
Methods in Enzymology (1980) 65:499-560. Sticky ended cleavage fragments may be
blunt ended using E. coli DNA polymerase I (Klenow) in the presence of the
appropriate deoxynucleotide triphosphates (dNTPs). Treatment
with S1 nuclease may also be used, resulting in the hydrolysis of any single stranded
DNA portions.
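
The site-specific cleavage step described above can be illustrated with a short sketch. It is only an illustration under stated assumptions: the 30-base sequence is hypothetical, the molecule is treated as linear, and EcoRI (recognition site GAATTC, top strand cut after the first G) is chosen merely as a familiar example of a restriction enzyme. It is not an enzyme named in this specification.

```python
def digest(sequence: str, site: str, cut_offset: int) -> list[str]:
    """Cut a linear DNA sequence at every occurrence of a recognition site.

    cut_offset is the number of bases into the site after which the top
    strand is cleaved (1 for EcoRI, which cuts G^AATTC).
    """
    sequence = sequence.upper()
    fragments = []
    start = 0
    pos = sequence.find(site)
    while pos != -1:
        cut = pos + cut_offset
        fragments.append(sequence[start:cut])
        start = cut
        pos = sequence.find(site, pos + 1)
    fragments.append(sequence[start:])  # remainder after the last cut
    return fragments


# Hypothetical 30-base linear sequence containing two EcoRI sites.
example = "ATGCGAATTCGGCTTAAGGAATTCTTAGCA"
fragments = digest(example, "GAATTC", 1)

# Fragment sizes are what gel electrophoresis would resolve.
fragment_lengths = sorted(len(f) for f in fragments)
print(fragment_lengths)
```

Only the top strand is modeled; the single stranded overhangs that make a fragment "sticky ended" are not represented.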

Ligations are carried out using standard buffer and temperature conditions using
T4 DNA ligase and ATP; sticky end ligations require less ATP and less ligase than
blunt end ligations. When vector fragments are used as part of a ligation mixture, the
vector fragment is often treated with bacterial alkaline phosphatase (BAP) or calf
intestinal alkaline phosphatase to remove the 5'-phosphate and thus prevent religation
of the vector; alternatively, restriction enzyme digestion of unwanted fragments can be
used to prevent ligation. Ligation mixtures are transformed into suitable cloning hosts,
such as E. coli, and successful transformants selected by, for example, antibiotic
resistance, and screened for the correct construction.

Synthetic oligonucleotides may be prepared using an automated oligonucleotide
synthesizer as described by Warner (1984), DNA 3:401. If desired, the synthetic strands
may be labeled with 32P by treatment with polynucleotide kinase in the presence of 32P-
ATP, using standard conditions for the reaction. DNA sequences, including those
isolated from cDNA libraries, may be modified by known techniques, including, for
example, site directed mutagenesis, as described by Zoller (1982), Nucleic Acids Res.
10:6487.

The expression constructs of the present invention, including the desired fusion,
or individual expression constructs comprising the individual components of these
fusions, may be used for nucleic acid immunization, to activate HCV-specific T cells,
using standard gene delivery protocols. Methods for gene delivery are known in the
art. See, e.g., U.S. Patent Nos. 5,399,346, 5,580,859, 5,589,466. Genes can be
delivered either directly to the vertebrate subject or, alternatively, delivered ex vivo, to
cells derived from the subject and the cells reimplanted in the subject. For example, the
constructs can be delivered as plasmid DNA, e.g., contained within a plasmid, such as
pBR322, pUC, or ColE1.

Additionally, the expression constructs can be packaged in liposomes prior to
delivery to the cells. Lipid encapsulation is generally accomplished using liposomes
which are able to stably bind or entrap and retain nucleic acid. The ratio of condensed
DNA to lipid preparation can vary but will generally be around 1:1 (mg
DNA:micromoles lipid), or more of lipid. For a review of the use of liposomes as
carriers for delivery of nucleic acids, see Hug and Sleight, Biochim. Biophys. Acta
(1991) 1097:1-17; Straubinger et al., in Methods of Enzymology (1983), Vol. 101, pp.
512-527.

Liposomal preparations for use with the present invention include cationic
(positively charged), anionic (negatively charged) and neutral preparations, with
cationic liposomes particularly preferred. Cationic liposomes are readily available. For
example, N[1-(2,3-dioleyloxy)propyl]-N,N,N-triethylammonium (DOTMA) liposomes
are available under the trademark Lipofectin, from GIBCO BRL, Grand Island, NY.
(See, also, Felgner et al., Proc. Natl. Acad. Sci. USA (1987) 84:7413-7416). Other
commercially available lipids include transfectace (DDAB/DOPE) and DOTAP/DOPE
(Boehringer). Other cationic liposomes can be prepared from readily available
materials using techniques well known in the art. See, e.g., Szoka et al., Proc. Natl.
Acad. Sci. USA (1978) 75:4194-4198; PCT Publication No. WO 90/11092 for a
description of the synthesis of DOTAP (1,2-bis(oleoyloxy)-3-
(trimethylammonio)propane) liposomes. The various liposome-nucleic acid complexes
are prepared using methods known in the art. See, e.g., Straubinger et al., in
METHODS OF IMMUNOLOGY (1983), Vol. 101, pp. 512-527; Szoka et al., Proc.
Natl. Acad. Sci. USA (1978) 75:4194-4198; Papahadjopoulos et al., Biochim. Biophys.
Acta (1975) 324:483; Wilson et al., Cell (1979) 17:77; Deamer and Bangham,
Biochim. Biophys. Acta (1976) 443:629; Ostro et al., Biochem. Biophys. Res. Commun.
(1977) 76:836; Fraley et al., Proc. Natl. Acad. Sci. USA (1979) 76:3348; Enoch and
Strittmatter, Proc. Natl. Acad. Sci. USA (1979) 76:145; Fraley et al., J. Biol. Chem.
(1980) 255:10431; Szoka and Papahadjopoulos, Proc. Natl. Acad. Sci. USA (1978)
75:145; and Schaefer-Ridder et al., Science (1982) 215:166.

The DNA can also be delivered in cochleate lipid compositions similar to those 
described by Papahadjopoulos et al., Biochem. Biophys. Acta. (1975) 394:483-491. 
See, also, U.S. Patent Nos. 4,663,161 and 4,871,488. 

A number of viral based systems have been developed for gene transfer into
mammalian cells. For example, retroviruses provide a convenient platform for gene
delivery systems, such as murine sarcoma virus, mouse mammary tumor virus,
Moloney murine leukemia virus, and Rous sarcoma virus. A selected gene can be
inserted into a vector and packaged in retroviral particles using techniques known in the
art. The recombinant virus can then be isolated and delivered to cells of the subject
either in vivo or ex vivo. A number of retroviral systems have been described (U.S.
Patent No. 5,219,740; Miller and Rosman, BioTechniques (1989) 7:980-990; Miller,
A.D., Human Gene Therapy (1990) 1:5-14; Scarpa et al., Virology (1991) 180:849-852;
Burns et al., Proc. Natl. Acad. Sci. USA (1993) 90:8033-8037; and Boris-Lawrie and
Temin, Cur. Opin. Genet. Develop. (1993) 3:102-109). Briefly, retroviral gene delivery
vehicles of the present invention may be readily constructed from a wide variety of
retroviruses, including, for example, B, C, and D type retroviruses as well as
spumaviruses and lentiviruses such as FIV, HIV, HIV-1, HIV-2 and SIV (see RNA
Tumor Viruses, Second Edition, Cold Spring Harbor Laboratory, 1985). Such
retroviruses may be readily obtained from depositories or collections such as the
American Type Culture Collection ("ATCC"; 10801 University Blvd., Manassas, VA
20110-2209), or isolated from known sources using commonly available techniques.

A number of adenovirus vectors have also been described, such as adenovirus
Type 2 and Type 5 vectors. Unlike retroviruses, which integrate into the host genome,
adenoviruses persist extrachromosomally, thus minimizing the risks associated with
insertional mutagenesis (Haj-Ahmad and Graham, J. Virol. (1986) 57:267-274; Bett et
al., J. Virol. (1993) 67:5911-5921; Mittereder et al., Human Gene Therapy (1994)
5:717-729; Seth et al., J. Virol. (1994) 68:933-940; Barr et al., Gene Therapy (1994)
1:51-58; Berkner, K.L., BioTechniques (1988) 6:616-629; and Rich et al., Human Gene
Therapy (1993) 4:461-476).

Molecular conjugate vectors, such as the adenovirus chimeric vectors described
in Michael et al., J. Biol. Chem. (1993) 268:6866-6869 and Wagner et al., Proc. Natl.
Acad. Sci. USA (1992) 89:6099-6103, can also be used for gene delivery.

Members of the Alphavirus genus, such as but not limited to vectors derived
from Sindbis virus, Semliki Forest virus, and VEE, will also find use as viral vectors for
delivering the gene of interest. For a description of Sindbis-virus derived vectors useful
for the practice of the instant methods, see Dubensky et al., J. Virol. (1996) 70:508-
519; and International Publication Nos. WO 95/07995 and WO 96/17072.

Other vectors can be used, including but not limited to simian virus 40 and
cytomegalovirus. Bacterial vectors, such as Salmonella spp., Yersinia enterocolitica,
Shigella spp., Vibrio cholerae, Mycobacterium strain BCG, and Listeria
monocytogenes, can be used. Minichromosomes such as MC and MC1, bacteriophages,
cosmids (plasmids into which phage lambda cos sites have been inserted) and replicons
(genetic elements that are capable of replication under their own control in a cell) can
also be used.

The expression constructs may also be encapsulated, adsorbed to, or associated
with, particulate carriers. Such carriers present multiple copies of a selected molecule
to the immune system and promote trapping and retention of molecules in local lymph
nodes. The particles can be phagocytosed by macrophages and can enhance antigen
presentation through cytokine release. Examples of particulate carriers include those
derived from polymethyl methacrylate polymers, as well as microparticles derived from
poly(lactides) and poly(lactide-co-glycolides), known as PLG. See, e.g., Jeffery et al.,
Pharm. Res. (1993) 10:362-368; and McGee et al., J. Microencap. (1996).

A wide variety of other methods can be used to deliver the expression
constructs to cells. Such methods include DEAE dextran-mediated transfection,
calcium phosphate precipitation, polylysine- or polyornithine-mediated transfection, or
precipitation using other insoluble inorganic salts, such as strontium phosphate,
aluminum silicates including bentonite and kaolin, chromic oxide, magnesium silicate,
talc, and the like. Other useful methods of transfection include electroporation,
sonoporation, protoplast fusion, liposomes, peptoid delivery, or microinjection. See,
e.g., Sambrook et al., supra, for a discussion of techniques for transforming cells of
interest; and Felgner, P.L., Advanced Drug Delivery Reviews (1990) 5:163-187, for a
review of delivery systems useful for gene transfer. One particularly effective method
of delivering DNA using electroporation is described in International Publication No.
WO/0045823.

Additionally, biolistic delivery systems employing particulate carriers, such as
gold and tungsten, are especially useful for delivering the expression constructs of the
present invention. The particles are coated with the construct to be delivered and
accelerated to high velocity, generally under a reduced atmosphere, using a gun powder
discharge from a "gene gun." For a description of such techniques, and apparatuses
useful therefor, see, e.g., U.S. Patent Nos. 4,945,050; 5,036,006; 5,100,792;
5,179,022; 5,371,015; and 5,478,744.



Compositions 

The invention also provides compositions comprising the HCV polypeptides or
polynucleotides described herein. Such compositions are useful as diagnostics, for
example, using the mutant polypeptides (or polynucleotides encoding these
polypeptides) in diagnostic reagents. Diagnostics using polypeptides and
polynucleotides are known to those of skill in the art.

In addition, immunogenic compounds can be prepared from one or more
immunogenic polypeptides derived from the polypeptides described herein, for
example the ΔNS35 polypeptide. The preparation of immunogenic compounds which
contain immunogenic polypeptide(s) as active ingredients is known to one skilled in the
art. Typically, such immunogenic compounds are prepared as injectables, either as
liquid solutions or suspensions; solid forms suitable for solution in, or suspension in,
liquid prior to injection can also be prepared. The preparation can also be emulsified, or
the protein encapsulated in liposomes.

Immunogenic and diagnostic compositions of the invention preferably comprise
a pharmaceutically acceptable carrier. The carrier should not itself induce the
production of antibodies harmful to the host. Pharmaceutically acceptable carriers are
well known to those in the art. Such carriers include, but are not limited to, large,
slowly metabolized macromolecules, such as proteins, polysaccharides such as latex
functionalized sepharose, agarose, cellulose, cellulose beads and the like, polylactic
acids, polyglycolic acids, polymeric amino acids such as polyglutamic acid, polylysine,
and the like, amino acid copolymers, and inactive virus particles.

Pharmaceutically acceptable salts can also be used in compositions of the
invention, for example, mineral salts such as hydrochlorides, hydrobromides,
phosphates, or sulfates, as well as salts of organic acids such as acetates, propionates,
malonates, or benzoates. Especially useful protein substrates are serum albumins,
keyhole limpet hemocyanin, immunoglobulin molecules, thyroglobulin, ovalbumin,
tetanus toxoid, and other proteins well known to those of skill in the art. Compositions
of the invention can also contain liquids or excipients, such as water, saline, glycerol,
dextrose, ethanol, or the like, singly or in combination, as well as substances such as
wetting agents, emulsifying agents, or pH buffering agents. Liposomes can also be
used as a carrier for a composition of the invention; such liposomes are described
above.

If desired, co-stimulatory molecules which improve immunogen presentation to
lymphocytes, such as B7-1 or B7-2, or cytokines such as GM-CSF, IL-2, and IL-12,
can be included in a composition of the invention. Optionally, adjuvants can also be
included in a composition. Adjuvants which can be used include, but are not limited to:
(1) aluminum salts (alum), such as aluminum hydroxide, aluminum phosphate,
aluminum sulfate, etc; (2) oil-in-water emulsion formulations (with or without other
specific immunostimulating agents such as muramyl peptides (see below) or bacterial
cell wall components), such as for example (a) MF59 (PCT Publ. No. WO 90/14837),
containing 5% Squalene, 0.5% Tween 80, and 0.5% Span 85 (optionally containing
various amounts of MTP-PE), formulated into submicron particles using a
microfluidizer such as Model 110Y microfluidizer (Microfluidics, Newton, MA),
(b) SAF, containing 10% Squalane, 0.4% Tween 80, 5% pluronic-blocked polymer
L121, and thr-MDP (see below) either microfluidized into a submicron emulsion or
vortexed to generate a larger particle size emulsion, and (c) Ribi™ adjuvant system
(RAS), (Ribi Immunochem, Hamilton, MT) containing 2% Squalene, 0.2% Tween 80,
and one or more bacterial cell wall components from the group consisting of
monophosphorylipid A (MPL), trehalose dimycolate (TDM), and cell wall skeleton
(CWS), preferably MPL + CWS (Detox™); (3) saponin adjuvants, such as Stimulon™
(Cambridge Bioscience, Worcester, MA) may be used or particles generated therefrom
such as ISCOMs (immunostimulating complexes); (4) Complete Freund's Adjuvant
(CFA) and Incomplete Freund's Adjuvant (IFA); (5) cytokines, such as interleukins
(e.g., IL-1, IL-2, IL-4, IL-5, IL-6, IL-7, IL-12, etc), interferons (e.g., gamma
interferon), macrophage colony stimulating factor (M-CSF), tumor necrosis factor
(TNF), etc; (6) detoxified mutants of a bacterial ADP-ribosylating toxin such as a
cholera toxin (CT), a pertussis toxin (PT), or an E. coli heat-labile toxin (LT),
particularly LT-K63, LT-R72, CT-S109, PT-K9/G129; see, e.g., WO 93/13302 and
WO 92/19265; (7) other substances that act as immunostimulating agents to enhance
the effectiveness of the composition; and (8) microparticles with adsorbed
macromolecules, as described in copending U.S. Patent Application Serial No.
09/285,855 (filed April 2, 1999) and International Patent Application Serial No.
PCT/US99/17308 (filed July 29, 1999). Alum and MF59 are preferred. The
effectiveness of an adjuvant can be determined by measuring the amount of antibodies
directed against an immunogenic polypeptide containing an HCV antigenic sequence
resulting from administration of this polypeptide in immunogenic compounds which
are also comprised of the various adjuvants.

As mentioned above, muramyl peptides include, but are not limited to, N-
acetyl-muramyl-L-threonyl-D-isoglutamine (thr-MDP), N-acetyl-normuramyl-L-alanyl-
D-isoglutamine (CGP 11637, referred to as nor-MDP), N-acetylmuramyl-L-alanyl-D-
isoglutaminyl-L-alanine-2-(1'-2'-dipalmitoyl-sn-glycero-3-hydroxyphosphoryloxy)-
ethylamine (CGP 19835A, referred to as MTP-PE), etc.






Thus, such recombinant or synthetic HCV polypeptides can be used in vaccines
and as diagnostics. Further, antibodies raised against these polypeptides can also be
used as diagnostics, or for passive immunotherapy. In addition, antibodies to these
polypeptides are useful for isolating and identifying HCV particles.

Native HCV antigens can also be isolated from HCV virions. The virions can be
grown in HCV infected cells in tissue culture, or in an infected host.

Administration and Delivery 

The polynucleotide and polypeptide compositions described herein (e.g.,
immunogenic compounds) may be administered to a subject using any suitable delivery
means. Methods of delivering nucleic acids into host cells are discussed above.
Further, HCV polynucleotides and/or polypeptides can be administered parenterally, by
injection, usually subcutaneously, intramuscularly, transdermally or transcutaneously.
Certain adjuvants, e.g., LTK63, LTR72 or PLG formulations, can be administered
intranasally or orally. Additional formulations which are suitable for other modes of
administration include suppositories. For suppositories, traditional binders and carriers
can include, for example, polyalkylene glycols or triglycerides; such suppositories can
be formed from mixtures containing the active ingredient in the range of 0.5% to 10%,
preferably 1%-2%. Other oral formulations include such normally employed excipients
as, for example, pharmaceutical grades of mannitol, lactose, starch, magnesium
stearate, sodium saccharine, cellulose, magnesium carbonate, and the like. These
compositions take the form of solutions, suspensions, tablets, pills, capsules, sustained
release formulations or powders and contain 10%-95% of active ingredient, preferably
25%-70%.

The polypeptides of the present invention can be formulated into the
immunogenic compound as neutral or salt forms. Pharmaceutically acceptable salts
include the acid addition salts (formed with free amino groups of the peptide), which
are formed with inorganic acids such as, for example, hydrochloric or
phosphoric acids, or organic acids such as acetic, oxalic, tartaric, maleic, and the
like. Salts formed with the free carboxyl groups can also be derived from inorganic
bases such as, for example, sodium, potassium, ammonium, calcium, or ferric
hydroxides, and such organic bases as isopropylamine, trimethylamine, 2-ethylamino
ethanol, histidine, procaine, and the like.

The immunogenic compounds are administered in a manner compatible with the
dosage formulation, and in such amount as will be prophylactically and/or
therapeutically effective. The quantity to be administered, which is generally in the
range of 5 micrograms to 250 micrograms of polypeptide per dose, depends on the
subject to be treated, capacity of the subject's immune system to synthesize antibodies,
and the degree of protection desired. Precise amounts of active ingredient required to be
administered may depend on the judgment of the practitioner and can be peculiar to
each subject.

The immunogenic compound can be given in a single dose schedule, or
preferably in a multiple dose schedule. A multiple dose schedule is one in which a
primary course of vaccination can be with 1-10 separate doses, followed by other doses
given at subsequent time intervals required to maintain and/or reinforce the immune
response, for example, at 1-4 months for a second dose, and if needed, a subsequent
dose(s) after several months. Further, the course of administration may include
polynucleotides and polypeptides, together or sequentially (for example, priming with a
polynucleotide composition and boosting with a polypeptide composition). The dosage
regimen will also, at least in part, be determined by the need of the individual and be
dependent upon the judgment of the practitioner.

In certain embodiments, administration of the polynucleotides and polypeptides
described herein is used to activate T cells. In addition to the practical advantages of
simplicity of construction and modification, administration of polynucleotides encoding
mutant NS polypeptides results in the synthesis of a mutant NS polypeptide in the host.
Thus, these immunogens are presented to the host immune system with native post-
translational modifications, structure, and conformation. The polynucleotides are
preferably injected intramuscularly to a large mammal, such as a human, at a dose of
0.5, 0.75, 1.0, 1.5, 2.0, 2.5, 5 or 10 mg/kg.
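
As a purely arithmetic illustration of the per-kilogram figures above, the following sketch converts a listed mg/kg level into a total amount of polynucleotide for a given body weight. The 70 kg body weight and the function name are assumptions made for the example; nothing here is a dosing recommendation.

```python
# Dose levels listed in the text, in mg of polynucleotide per kg body weight.
LISTED_DOSES_MG_PER_KG = (0.5, 0.75, 1.0, 1.5, 2.0, 2.5, 5.0, 10.0)


def total_dose_mg(body_weight_kg: float, dose_mg_per_kg: float) -> float:
    """Return the total mass (mg) for one injection at a listed dose level."""
    if body_weight_kg <= 0:
        raise ValueError("body weight must be positive")
    if dose_mg_per_kg not in LISTED_DOSES_MG_PER_KG:
        raise ValueError("dose level is not one of those listed in the text")
    return body_weight_kg * dose_mg_per_kg


# Hypothetical 70 kg subject at each listed dose level.
table = {d: total_dose_mg(70.0, d) for d in LISTED_DOSES_MG_PER_KG}
print(table)
```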

The proteins and/or polynucleotides can be administered either to a mammal
which is not infected with an HCV or can be administered to an HCV-infected
mammal. The particular dosages of the polynucleotides or fusion proteins in a
composition will depend on many factors including, but not limited to, the species,
age, and general condition of the mammal to which the composition is administered,
and the mode of administration of the composition. An effective amount of the
composition of the invention can be readily determined using only routine
experimentation. In vitro and in vivo models can be employed to identify appropriate
doses. Generally, 0.5, 0.75, 1.0, 1.5, 2.0, 2.5, 5 or 10 mg will be administered to a large
mammal, such as a baboon, chimpanzee, or human. If desired, co-stimulatory
molecules or adjuvants can also be provided before, after, or together with the
compositions.

Antibodies and Diagnostics

Antibodies, both monoclonal and polyclonal, which are directed against HCV
epitopes are particularly useful in diagnosis, and those which are neutralizing are useful
in passive immunotherapy. Monoclonal antibodies, in particular, may be used to raise
anti-idiotype antibodies.

Anti-idiotype antibodies are immunoglobulins which carry an "internal image"
of the antigen of the infectious agent against which protection is desired. Techniques
for raising anti-idiotype antibodies are known in the art. See, e.g., Grzych (1985),
Nature 316:74; MacNamara et al. (1984), Science 226:1325; Uytdehaag et al. (1985), J.
Immunol. 134:1225. These anti-idiotype antibodies may also be useful for treatment
and/or diagnosis of NANBH, as well as for an elucidation of the immunogenic regions
of HCV antigens.

An immunoassay for viral antigen may use, for example, a monoclonal antibody
directed towards a viral epitope, a combination of monoclonal antibodies directed
towards epitopes of one viral polypeptide, monoclonal antibodies directed towards
epitopes of different viral polypeptides, polyclonal antibodies directed towards the
same viral antigen, polyclonal antibodies directed towards different viral antigens or a
combination of monoclonal and polyclonal antibodies.

Immunoassay protocols may be based, for example, upon competition, or direct
reaction, or sandwich type assays. Protocols may also, for example, use solid supports,
or may be by immunoprecipitation. Most assays involve the use of labeled antibody or
polypeptide. The labels may be, for example, fluorescent, chemiluminescent,
radioactive, or dye molecules. Assays which amplify the signals from the probe are also
known, examples of which are assays which utilize biotin and avidin, and enzyme-
labeled and mediated immunoassays, such as ELISA assays.

An enzyme-linked immunosorbent assay (ELISA) can be used to measure either
antigen or antibody concentrations. This method depends upon conjugation of an
enzyme to either an antigen or an antibody, and uses the bound enzyme activity as a
quantitative label. To measure antibody, the known antigen is fixed to a solid phase
(e.g., a microplate or plastic cup), incubated with test serum dilutions, washed,
incubated with anti-immunoglobulin labeled with an enzyme, and washed again.
Enzymes suitable for labeling are known in the art, and include, for example,
horseradish peroxidase. Enzyme activity bound to the solid phase is measured by
adding the specific substrate, and determining product formation or substrate utilization
colorimetrically. The enzyme activity bound is a direct function of the amount of
antibody bound.

To measure antigen, a known specific antibody is fixed to the solid phase, the
test material containing antigen is added, the solid phase is washed after an incubation,
and a second enzyme-labeled antibody is added. After washing, substrate is added, and
enzyme activity is estimated colorimetrically, and related to antigen concentration.
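
One common way to relate a colorimetric reading to antigen concentration is a standard curve built from samples of known concentration. The sketch below assumes piecewise-linear interpolation between standards, which is only one of several possible curve models and is not specified in this document; the concentrations and absorbance values are hypothetical.

```python
from bisect import bisect_left


def interpolate_concentration(absorbance, standards):
    """Estimate concentration from absorbance by linear interpolation.

    standards is a list of (concentration, absorbance) pairs measured on
    known samples; absorbance is assumed to rise with concentration.
    """
    points = sorted(standards, key=lambda p: p[1])
    signals = [a for _, a in points]
    if not signals[0] <= absorbance <= signals[-1]:
        raise ValueError("absorbance outside the range of the standard curve")
    i = bisect_left(signals, absorbance)
    if signals[i] == absorbance:
        return points[i][0]
    (c0, a0), (c1, a1) = points[i - 1], points[i]
    return c0 + (c1 - c0) * (absorbance - a0) / (a1 - a0)


# Hypothetical standards: (concentration in ng/ml, absorbance reading).
standards = [(0.0, 0.05), (10.0, 0.25), (20.0, 0.45), (40.0, 0.85)]
estimate = interpolate_concentration(0.35, standards)
print(round(estimate, 3))
```

Readings outside the range of the standards are rejected rather than extrapolated, since the curve is only characterized between the lowest and highest standard.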

The HCV fusion proteins, such as NS3 mutant and core fusion proteins, can
also be used to produce HCV-specific polyclonal and monoclonal antibodies. HCV-
specific polyclonal and monoclonal antibodies specifically bind to HCV antigens.

Polyclonal antibodies can be produced by administering the fusion protein to a
mammal, such as a mouse, a rabbit, a goat, or a horse. Serum from the immunized
animal is collected and the antibodies are purified from the plasma by, for example,
precipitation with ammonium sulfate, followed by chromatography, preferably affinity
chromatography. Techniques for producing and processing polyclonal antisera are
known in the art.

Monoclonal antibodies directed against HCV-specific epitopes present in the
fusion proteins can also be readily produced. Normal B cells from a mammal, such as a
mouse, immunized with, e.g., a mutant NS3 polypeptide or NS-core fusion protein can
be fused with, for example, HAT-sensitive mouse myeloma cells to produce
hybridomas. Hybridomas producing HCV-specific antibodies can be identified using
RIA or ELISA and isolated by cloning in semi-solid agar or by limiting dilution.
Clones producing HCV-specific antibodies are isolated by another round of screening.

Antibodies, either monoclonal or polyclonal, which are directed against HCV
epitopes, are particularly useful for detecting the presence of HCV or HCV antigens in
a sample, such as a serum sample from an HCV-infected human. An immunoassay for
an HCV antigen may utilize one antibody or several antibodies. An immunoassay for
an HCV antigen may use, for example, a monoclonal antibody directed towards an
HCV epitope, a combination of monoclonal antibodies directed towards epitopes of one
HCV polypeptide, monoclonal antibodies directed towards epitopes of different HCV
polypeptides, polyclonal antibodies directed towards the same HCV antigen, polyclonal
antibodies directed towards different HCV antigens, or a combination of monoclonal
and polyclonal antibodies. Immunoassay protocols may be based, for example, upon
competition, direct reaction, or sandwich type assays using, for example, labeled
antibody. The labels may be, for example, fluorescent, chemiluminescent, or
radioactive.

The polyclonal or monoclonal antibodies may further be used to isolate HCV
particles or antigens by immunoaffinity columns. The antibodies can be affixed to a
solid support by, for example, adsorption or by covalent linkage so that the antibodies
retain their immunoselective activity. Optionally, spacer groups may be included so
that the antigen binding site of the antibody remains accessible. The immobilized
antibodies can then be used to bind HCV particles or antigens from a biological sample,
such as blood or plasma. The bound HCV particles or antigens are recovered from the
column matrix by, for example, a change in pH.

Methods of Eliciting Immune Responses

HCV-specific T cells that are activated by the above-described polypeptides,
expressed in vivo or in vitro, preferably recognize an epitope of an HCV polypeptide
such as a mutant NS3 polypeptide, including an epitope of a mutant HCV polypeptide.
HCV-specific T cells can be CD8+ or CD4+.

HCV-specific CD8+ T cells preferably are cytotoxic T lymphocytes (CTL)
which can kill HCV-infected cells that display NS3, NS4, NS5a, or NS5b epitopes
complexed with an MHC class I molecule. HCV-specific CD8+ T cells may also
express interferon-γ (IFN-γ). HCV-specific CD8+ T cells can be detected by, for
example, 51Cr release assays. 51Cr release assays measure the ability of HCV-specific
CD8+ T cells to lyse target cells displaying a nonstructural (e.g., mutant NS) epitope.
HCV-specific CD8+ T cells which express IFN-γ can also be detected by
immunological methods, preferably by intracellular staining for IFN-γ after in vitro
stimulation with a mutant NS polypeptide.

HCV-specific CD4+ cells activated by the above-described polypeptides,
expressed in vivo or in vitro, and combinations of the individual components of these
proteins, preferably recognize an epitope of a mutant non-structural polypeptide,
including an epitope of a mutant protein, that is bound to an MHC class II molecule on
an HCV-infected cell and proliferate in response to stimulating mutant peptides.

HCV-specific CD4+ T cells can be detected by a lymphoproliferation assay.
Lymphoproliferation assays measure the ability of HCV-specific CD4+ T cells to
proliferate in response to an epitope.

Mutant NS (or fusions thereof with core, envelope or other viral polypeptides)
can be used to activate HCV-specific T cells either in vitro or in vivo. Activation of
HCV-specific T cells can be used, inter alia, to provide model systems to optimize
CTL responses to HCV and to provide prophylactic or therapeutic treatment against
HCV infection. For in vitro activation, proteins are preferably supplied to T cells via a
plasmid or a viral vector, such as an adenovirus vector, as described above.

Polyclonal populations of T cells can be derived from the blood, and preferably
from peripheral lymphoid organs, such as lymph nodes, spleen, or thymus, of mammals
that have been infected with an HCV. Preferred mammals include mice, chimpanzees,
baboons, and humans. The HCV serves to expand the number of activated HCV-specific
T cells in the mammal. The HCV-specific T cells derived from the mammal
can then be restimulated in vitro by adding HCV epitopic peptides to the T cells. The
HCV-specific T cells can then be tested for, inter alia, proliferation (e.g.,
lymphoproliferation assays known in the art), the production of IFN-γ, and the ability
to lyse target cells displaying HCV NS epitopes in vitro.

The following examples are meant to illustrate the invention and are not meant
to limit it in any way. Those of ordinary skill in the art will recognize modifications
within the spirit and scope of the invention as set forth herein.

EXAMPLES
Example 1: Constructs

pCMV-II : pCMV-II (Figure 7, SEQ ID NO:5) was created to contain the human
CMV promoter, enhancer, intron A, polylinker and the bovine growth hormone
terminator in a deleted-pUC backbone (Life Technologies).

pT7-HCV : pT7-HCV was created in a polylinker-modified pUC vector to 
contain full-length HCV cDNA preceded by a synthetic T7 promoter. pT7-HCV also 
contains the complete 5' UTR and the poly A version of the 3' UTR. 

pCMV.ΔNS35 : To generate pCMV.ΔNS35 (Figure 5, SEQ ID NO:3), a two-step
procedure was undertaken. First, a PCR product was generated from pT7-HCV
that corresponded to the following: a 5' EcoRI site, followed by the Kozak sequence of
ACCATGG; the initiator ATG followed by amino acid #1242 and continuing to the
StuI site. Second, the StuI to XbaI fragment from a full-length genomic clone was
isolated. The genomic clone consisted of the T7 promoter fused to the full-length HCV
cDNA with the poly A version of the 3' end, in a pUC vector. Finally, the EcoRI-StuI
and StuI-XbaI fragments were ligated into the pCMV-II expression vector, transformed
into HB101 competent cells and plated onto ampicillin (100 µg/ml). Miniprep analyses
led to the identification of the desired clone, which was amplified on a larger scale using
a Qiagen Gigaprep kit following the manufacturer's specifications. The resulting clone
was named pCMV.ΔNS35 (Figure 5, SEQ ID NO:3).
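
The 5' end of the PCR product described above (EcoRI site, then the Kozak
sequence ACCATGG containing the initiator ATG) can be checked with a short
script. The following is a minimal sketch, not part of the described procedure:
the function name, the example junction string (the EcoRI site GAATTC joined
to the Kozak sequence, plus two illustrative alanine codons) and the
three-entry codon table are assumptions made for illustration only.

```python
# Minimal sketch: inspect a 5' junction of the form EcoRI site + Kozak sequence.
ECORI_SITE = "GAATTC"   # EcoRI recognition sequence
KOZAK = "ACCATGG"       # Kozak sequence quoted in the text

# Partial codon table covering only the codons used in this example (assumption).
CODONS = {"ATG": "M", "GCT": "A", "GCA": "A"}


def describe_junction(seq):
    """Return (EcoRI position, ATG position, translated peptide) for seq."""
    seq = seq.upper()
    site_pos = seq.find(ECORI_SITE)
    kozak_pos = seq.find(KOZAK)
    if site_pos < 0 or kozak_pos < 0:
        raise ValueError("EcoRI site or Kozak sequence not found")
    # The initiator ATG sits inside the Kozak sequence (ACC-ATG-G).
    atg_pos = kozak_pos + KOZAK.index("ATG")
    peptide = []
    for i in range(atg_pos, len(seq) - 2, 3):
        aa = CODONS.get(seq[i:i + 3])
        if aa is None:          # stop at the first codon missing from the table
            break
        peptide.append(aa)
    return site_pos, atg_pos, "".join(peptide)


# Hypothetical junction: EcoRI site, Kozak sequence, two illustrative codons.
junction = "GAATTCACCATGGCTGCA"
print(describe_junction(junction))
```

A full check of a real construct would use the verified plasmid sequence and a
complete codon table rather than the shortened inputs shown here.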

pd.ΔNS3NS5 : As shown schematically in Figure 10, the yeast expression
plasmid pd.ΔNS3NS5 (SEQ ID NO:8) was constructed using restriction fragments
obtained from the mammalian expression plasmid pCMV.KM.ΔNS35.
pCMV.KM.ΔNS35 is identical to pCMV.ΔNS35 (Figure 5, SEQ ID NO:3) except that
it contains a kanamycin resistance gene in the viral backbone. pCMV.KM.ΔNS35 was
digested with EcoRI and NheI to obtain a 2895bp EcoRI-NheI fragment. The EcoRI-NheI
fragment was ligated into pRSET HindIII-NheI subcloning vector with oligos (HE)
from HindIII to EcoRI. After sequence verification, pRSET HindIII-NheI #6 was
digested with HindIII and NheI to obtain a 2908bp HindIII-NheI fragment.

pCMV.KM.ΔNS35 was linearized with XbaI and ligated with synthetic oligos
(XS) from XbaI-SalI. The ligation was digested with NheI and SalI to obtain a 2481bp
NheI-SalI fragment. The fragment was ligated into pET3a NheI-SalI subcloning
vector. After sequence verification, pET3a NheI-SalI #2 was digested with NheI and
SalI to obtain a 2481bp NheI-SalI fragment. The BamHI-HindIII ADH2/GAPDH promoter
fragment was then ligated with the HindIII-NheI and NheI-SalI fragments into pBS24.1
BamHI-SalI yeast expression vector.

pd.ΔNS3NS5.PJ : pd.ΔNS3NS5.PJ (Figures 13 and 14; SEQ ID NO:10) was
generated to create a "perfect junction" at the 5' and 3' ends of the HCV coding region.
At the 5' end of pd.ΔNS3NS5, there were 6 extra bases between the yeast
ADH2/GAPDH promoter and the ATG of the polypeptide. At the 3' end, there were 52
bases of untranslated sequence between the stop codon of the polypeptide and the
α-factor terminator in the yeast expression vector. pd.ΔNS3NS5.PJ was created by
digesting pd.ΔNS3NS5 #17 with ScaI and SphI to obtain a 4963bp ScaI-SphI fragment.
pd.NS5b3011 was digested with SphI and SalI to obtain a 321bp SphI-SalI fragment
which gave the "perfect junction" at the 3' end of the polypeptide. The ScaI-SphI and
SphI-SalI fragments were ligated into pSP72 HindIII-SalI subcloning vector with
synthetic oligos from HindIII-ScaI (HS) for the "perfect junction" at the 5' end.

The region of synthetic sequence in pSP72 HindIII-SalI clone #6 was verified.
pSP72 HindIII-SalI clone #6 was digested with HindIII and BlnI or with BlnI and SalI
to obtain 2441bp HindIII-BlnI and 2895bp BlnI-SalI fragments, respectively. The
BamHI-HindIII ADH2/GAPDH promoter fragment was ligated to the HindIII-BlnI and
BlnI-SalI fragments into pBS24.1 BamHI-SalI yeast expression vector.

pd.ΔNS3NS5.PJ.core121RT and pd.ΔNS3NS5.PJ.core173RT were generated
and encode HCV core aa 1-121 at the C-terminus of the ΔNS3NS5 polypeptide
(designated pd.ΔNS3NS5.PJ.core121RT, SEQ ID NO:12) and core aa 1-173 at the
C-terminus of the ΔNS3NS5 polypeptide (designated pd.ΔNS3NS5.PJ.core173RT, SEQ
ID NO:14). The core sequence had aa 9 mutated from Lys to Arg and aa 11 mutated
from Asn to Thr, designated as core 121RT or 173RT.

pd.ΔNS3NS5.PJ.core121RT and pd.ΔNS3NS5.PJ.core173RT : To generate
pd.ΔNS3NS5.PJ.core121RT (Figure 17, SEQ ID NO:12) and
pd.ΔNS3NS5.PJ.core173RT (Figure 18, SEQ ID NO:14), NotI-SalI HCVcore121RT
and HCVcore173RT fragments were amplified by PCR from an E. coli
expression plasmid, pSODCF2.HCVcore191RT #2, as shown in Figure 16. Either the
core 121RT NotI-SalI PCR product or the core 173RT NotI-SalI PCR product was
ligated into a pT7Blue2 PstI-SalI subcloning vector with synthetic oligos (PN) from
PstI to NotI. After sequence confirmation, pT7Blue2core121RT clone #9 and
pT7Blue2core173RT clone #11 were digested with PstI and SalI to obtain 403bp and
559bp PstI-SalI fragments, respectively, for further cloning.

A 121bp NotI-PstI fragment from pSP72 HindIII-SalI clone #6 was isolated as
described above during the cloning of pd.ΔNS3NS5.PJ. The NotI-PstI and PstI-SalI
fragments were assembled into a vector made by digesting pd.ΔNS3NS5.PJ clone #5
(described above) with NotI and SalI.

ΔNS3NS5 and Core 140 and Core 150 : An HCV core epitope was found which
elicits CTLs in baboons (HCV core aa 121-135). Since pd.ΔNS3NS5.PJ.core121RT
ends right before this potentially important epitope and was expressed better than the
longer pd.ΔNS3NS5.PJ.core173RT construct (Example 2), two intermediate constructs
were made which include this epitope, possibly giving intermediate expression levels.
The two new constructs fused HCV core aa 1-140 or HCV core aa 1-150 to the C
terminus of ΔNS3NS5.PJ.

pd.ΔNS3NS5.PJ.core140RT (Figure 21, SEQ ID NO:16) and
pd.ΔNS3NS5.PJ.core150RT (Figure 22, SEQ ID NO:18) : As shown in Figure 20, a
PstI-SalI HCVcore140RT and a PstI-SalI HCVcore150RT fragment were amplified by
PCR from pd.ΔNS3NS5.PJ.core173RT clone #16. Either HCV core PstI-SalI
PCR product was ligated into pT7Blue2 PstI-SalI subcloning vector. After sequence
confirmation, pT7Blue2core140RT clone #22 and pT7Blue2core150RT clone #26 were
digested with PstI and SalI to obtain 460bp and 490bp PstI-SalI fragments,
respectively, for further cloning.

A 121bp NotI-PstI fragment was isolated from pSP72 HindIII-SalI clone #6 (as
described above during the cloning of pd.ΔNS3NS5.PJ). The NotI-PstI and PstI-SalI
fragments were assembled into a vector made by digesting pd.ΔNS3NS5.PJ clone #5
(described above) with NotI and SalI.


Example 2: Protein Expression 

Various of the constructs described herein, encoding HCV-1 ΔNS3 to NS5
antigen (aa 1242-3011), were expressed in yeast. S. cerevisiae strain AD3 was
transformed with pd.ΔNS3NS5 and checked for expression. A stained protein band at
the expected molecular weight of 194 kD was not observed (Figure 12). Strain AD3
was also transformed with pd.ΔNS3NS5.PJ clone #5 and checked for expression. A
protein band of the expected molecular weight of 194 kD was detected (Figure 15).

Strain AD3 was transformed with pd.ΔNS3NS5.PJ.core121RT clone #6 and
pd.ΔNS3NS5.PJ.core173RT clone #15 and checked for expression. Protein bands of the
expected molecular weights of 206 kD and 210 kD, respectively, were observed.
Expression levels of the pd.ΔNS3NS5.PJ.core173RT construct were much lower than
those of the pd.ΔNS3NS5.PJ.core121RT construct (see Figure 19). Thus, protein
expression levels appear to decrease as the length of the HCV core sequence increases.

Strain AD3 was transformed with pd.ΔNS3NS5.PJ.core140RT clone #29 and
pd.ΔNS3NS5.PJ.core150RT clone #35 and checked for expression. Bands of the
expected molecular weights of 208 kD and 209 kD were seen by stain at levels close to
those of pd.ΔNS3NS5.PJ.core173RT (Figure 23).
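
As a rough plausibility check on the expected molecular weights quoted in this
example, polypeptide mass can be estimated from residue count. The sketch below
assumes an average residue mass of about 110 Da, which is a common rule of
thumb and is not stated in the source; the function names are illustrative.
It ignores the initiator methionine and any linker residues, so it should only
be expected to land within a few kD of the quoted values.

```python
# Rule-of-thumb estimate of polypeptide mass from residue count.
AVG_RESIDUE_KD = 0.110  # assumed average mass per amino acid residue, in kD


def residue_count(first_aa, last_aa):
    """Number of residues in an inclusive amino acid range."""
    return last_aa - first_aa + 1


def estimate_kd(n_residues):
    """Approximate molecular weight in kD for n_residues amino acids."""
    return n_residues * AVG_RESIDUE_KD


# HCV aa 1242-3011, optionally followed by core aa 1-N at the C-terminus.
base = residue_count(1242, 3011)
for label, core_len in [("no core", 0), ("core 1-121", 121), ("core 1-140", 140),
                        ("core 1-150", 150), ("core 1-173", 173)]:
    total = base + core_len
    print(f"{label}: {total} aa, about {estimate_kd(total):.0f} kD")
```

For the aa 1242-3011 polypeptide alone this estimate is close to the 194 kD
stated above; the fusion estimates run slightly higher than the quoted figures,
which is within the tolerance of such an approximation.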

Example 3: Eliciting Immune Responses 

A. Immunization

To evaluate the immunogenicity of the mutant NS polypeptides, studies using
guinea pigs, rabbits, mice, rhesus macaques and/or baboons are performed. The studies
are structured as follows: DNA immunization alone (single or multiple); DNA
immunization followed by protein immunization (boost); DNA immunization followed
by protein immunization; immunization by PLG particles. Immunization is
intramuscular or mucosal.

B. Humoral Immune Response

The humoral immune response is checked in serum specimens from immunized
animals with anti-NS antibody ELISAs (enzyme-linked immunosorbent assays) at
various times post-immunization. Briefly, serum from immunized animals is screened
for antibodies directed against the NS or mutant NS proteins. Wells of ELISA
microtiter plates are coated overnight with the selected HCV protein and washed four
times; subsequently, blocking is done with PBS-0.2% Tween (Sigma). After removal
of the blocking solution, diluted mouse serum is added. Sera are tested at various
dilutions. Microtiter plates are washed and incubated with a secondary,
peroxidase-coupled anti-mouse IgG antibody (Pierce, Rockford, IL). ELISA plates are
washed and 3,3',5,5'-tetramethylbenzidine (TMB; Pierce) is added per well. The optical
density of each well is measured. Titers are typically reported as the reciprocal of the
dilution of serum that gave a half-maximum optical density (O.D.). Similarly,
generation of neutralization of binding (NOB) antibodies can be measured by methods
known in the art.
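
The half-maximum titer described above can be computed from a dilution series.
The following is a minimal sketch under stated assumptions: the function name
is hypothetical, the maximum O.D. is taken as the largest value in the series,
and the crossing point is found by linear interpolation on a log10 dilution
scale, which is one common convention and is not specified in the source.

```python
import math


def half_max_titer(reciprocal_dilutions, optical_densities):
    """Reciprocal dilution at which O.D. falls to half of the series maximum.

    Returns None if the O.D. never drops below half-maximum in the series.
    """
    points = sorted(zip(reciprocal_dilutions, optical_densities))
    half_max = max(od for _, od in points) / 2.0
    # Walk adjacent dilutions and find where the curve crosses half-maximum.
    for (d_lo, od_lo), (d_hi, od_hi) in zip(points, points[1:]):
        if od_lo >= half_max > od_hi:
            frac = (od_lo - half_max) / (od_lo - od_hi)
            log_titer = math.log10(d_lo) + frac * (
                math.log10(d_hi) - math.log10(d_lo))
            return 10 ** log_titer
    return None


# Illustrative (made-up) series: 1:100, 1:1000 and 1:10000 dilutions.
titer = half_max_titer([100, 1000, 10000], [2.0, 1.5, 0.5])
print(f"half-maximum titer: about {titer:.0f}")
```

Other conventions, such as fitting a four-parameter logistic curve, would give
somewhat different values; the interpolation here is only the simplest option.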

C. Cellular Immune Response 

The frequency of specific cytotoxic T-lymphocytes (CTL) is evaluated by a
standard chromium release assay of peptide-pulsed BALB/c mouse CD4 cells. Briefly,
spleen cells (Effector cells, E) are obtained from the immunized BALB/c mice,
cultured, restimulated, and assayed for CTL activity against HCV peptide-pulsed target
cells. Cytotoxic activity is measured in a standard 51Cr release assay.
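
Results of a 51Cr release assay are conventionally expressed as percent
specific lysis. The source does not give the formula, so the sketch below uses
the widely used definition (experimental release minus spontaneous release,
divided by maximum release minus spontaneous release); the function and
parameter names and the example counts are illustrative assumptions.

```python
def percent_specific_lysis(experimental, spontaneous, maximum):
    """Percent specific lysis from 51Cr release counts (e.g., cpm).

    experimental: release from targets incubated with effector cells
    spontaneous:  release from targets incubated in medium alone
    maximum:      release from fully lysed targets (e.g., detergent)
    """
    if maximum <= spontaneous:
        raise ValueError("maximum release must exceed spontaneous release")
    return 100.0 * (experimental - spontaneous) / (maximum - spontaneous)


# Illustrative (made-up) counts for a single effector:target ratio.
print(percent_specific_lysis(experimental=1500, spontaneous=500, maximum=2500))
```

In practice the calculation is repeated at several effector:target ratios and
compared against targets pulsed with an irrelevant peptide.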

Example 4: Immunization with PLG-delivered DNA

The polylactide-co-glycolide (PLG) polymers are obtained from Boehringer
Ingelheim, U.S.A. The PLG polymer is RG505, which has a copolymer ratio of 50/50
and a molecular weight of 65 kDa (manufacturer's data). Cationic microparticles with
adsorbed DNA are prepared using a modified solvent evaporation process, essentially
as described in Singh et al., Proc. Natl. Acad. Sci. USA (2000) 97:811-816. Briefly, the
microparticles are prepared by emulsifying a 5% w/v polymer solution in methylene
chloride with PBS at high speed using an IKA homogenizer. The primary emulsion is
then added to distilled water containing cetyl trimethyl ammonium bromide (CTAB)
(0.5% w/v). This results in the formation of a w/o/w emulsion which is stirred at
room temperature, allowing the methylene chloride to evaporate. The resulting
microparticles are washed in distilled water by centrifugation and freeze dried.
Following preparation, washing and collection, DNA is adsorbed onto the
microparticles by incubating cationic microparticles in a solution of DNA. The
microparticles are then separated by centrifugation, the pellet is washed with TE buffer
and the microparticles are freeze dried, resuspended and administered to animals.
Antibody titers are measured by ELISA assays.

What is claimed is:

1. An isolated mutant non-structural ("NS") HCV polypeptide comprising
a polypeptide having a mutation in the catalytic domain of NS3, wherein said mutation
functionally disrupts the catalytic domain.

2. The polypeptide of claim 1, wherein the mutation comprises a deletion.

3. The polypeptide of claim 1, wherein the mutation comprises a
substitution.

4. The polypeptide of any of claims 1-3, wherein said NS polypeptide
comprises NS3, NS4 and NS5.

5. The polypeptide of any of claims 1-3, wherein said NS polypeptide
consists of NS3, NS4 and NS5.

6. The polypeptide of any of claims 1-3, wherein said NS polypeptide
consists of NS3 and NS5.

7. The polypeptide of claim 6, wherein NS5 consists of NS5a.

8. The polypeptide of claim 6, wherein NS5 consists of NS5b.

9. The polypeptide of any of claims 1-3, wherein said NS polypeptide
consists of NS3 and NS4.

10. The polypeptide of claim 9, wherein NS4 consists of NS4a.

11. The polypeptide of claim 9, wherein NS4 consists of NS4b.

12. The polypeptide of claim 4, further comprising a second viral
polypeptide that is not NS3, NS4, or NS5 of HCV.

13. The polypeptide of claim 12, wherein the second viral polypeptide
comprises an HCV Core polypeptide ("C"), or fragment thereof.

14. The polypeptide of claim 13, wherein the C polypeptide is truncated.

15. The polypeptide of claim 14, wherein the truncation is at amino acid
121.

16. The polypeptide of claim 12, wherein the polypeptide further comprises
an HCV envelope protein ("E").

17. The polypeptide of claim 16, wherein the E is E1.

18. The polypeptide of claim 16, wherein the E is E2.

19. A composition comprising

(a) the polypeptide of any one of claims 1-18; and

(b) a pharmaceutically acceptable excipient.

20. An isolated and purified polynucleotide which encodes the mutant HCV
polypeptide according to any one of claims 1-18.

21. A composition comprising

(a) the isolated purified polynucleotide of claim 20; and

(b) a pharmaceutically acceptable excipient.

22. The composition of claim 21, wherein the polynucleotide is DNA.

23. The composition of claim 21, wherein the polynucleotide is in a
plasmid.

24. An expression vector comprising the polynucleotide of claim 20.

25. An expression vector comprising the polynucleotide of SEQ ID NO:8.

26. A host cell comprising the polynucleotide of claim 20.

27. The host cell of claim 26, wherein the cell is a yeast cell.

28. The host cell of claim 26, wherein the cell is a mammalian cell.

29. The host cell of claim 26, wherein the cell is an insect cell.

30. The host cell of claim 26, wherein the cell is a plant cell.

31. The host cell of claim 26, wherein the polynucleotide comprises the
sequence of SEQ ID NO:8.

32. The polypeptide of claim 1, wherein the polypeptide further comprises
SEQ ID NO:9.

33. A method of preparing a mutant NS HCV polypeptide, wherein the
method comprises the steps of:

a. transforming a host cell with an expression vector according to
claim 24, under conditions wherein the polypeptide is expressed;
and

b. isolating the polypeptide.

34. The method of claim 33, wherein the host cell is a yeast cell.

35. The method of claim 33, wherein the host cell is a mammalian cell.

36. The method of claim 33, wherein the host cell is an insect cell.

37. The method of claim 33, wherein the host cell is a plant cell.

38. An antibody that specifically binds to a polypeptide of any of claims
1-18.

39. The antibody of claim 38, wherein the antibody is a monoclonal
antibody.

40. The antibody of claim 38, wherein the antibody is a purified polyclonal
antibody.

41. A method of eliciting an immune response in a subject, comprising the
step of administering to the subject a polypeptide of any of claims 1-18.

42. A method of eliciting an immune response in a subject, comprising the
step of administering to the subject a polynucleotide of claim 20.

1


TCGCGCGTTT CGGTGATGAC GGTGAAAACC TCTGACACAT GCAGCTCCCG GAGACGGTCA CAGCTTGTCT GTAAGCGGAT 
AGCOTGCAAA GCCACTACTG CCACTTTTGG AGACTGTGTA CGTCGAGGGC CTCTGCCAGT GTCCAACAGA CATTCGCCTA 


81 


GCCGGGAGCA GACAAGCCCG TCAGGGCGCG TCAGCGQGTG TTGGCGGGTG TCGGGGCTGG CTTAACTATG CGGCATCAGA 
CGGCCCTCGT CTGTTCGGQC AGTCCXGCGC AGTCGCCCAC AACCGCCCAC AGCCCCGACC GAATTGATAC GCCGTAGTCT 


161 


StuE 

GCAGATTGTA CTGAGAGTGC ACCATATGAA GCTTTTTGCA AAAGCCTAGG CCTCCAAAAA AGCCTCCTCA CTACTTCTGG 
CGTCTAACAT GACTCTCACG TGGTATACTT CGAAAAACGT TTTCGGATCC GGAGGTTTTT TCGGAGGAGT GATGAAGACC 


241 


AATAGCTCAG AGGCCGAGGC GGCCTCGGCC TCTGCATAAA TAAAAAAAAT TAGTCAGCCA TGGGGCGGAG AATGGGC3GA 
TTA7CGACTC TCCGGCTCCG CCGGAGCCGG AGACGTATTT ATTTTTTTTA ATCAGTCGGT ACCCCGCCTC TTACCs^^^CT 


321 


ACTGGGC5GG GAGGGAATTA TTGGCTATTG GCCATTGCAT ACGTTGTATC TATATCATAA TATGTACATT TATATTGGCT 
TGACCCGCCC CTCCCTTAAT AACCGATAAC CGGTAACGTA TGCAACATAG ATATAGTATT ATACAaCTAA ATATAACC^A 


401 


CATGTCCAAT ATGACCGCCA TGTTGACATT GATTATTGAC TAfiTTATTAA TAGTAATCAA TTACGGGGTC ATTAGTTCAT 
GTACAGGTTA TACTGGCGGT ACAACTGTAA CTAATAACTG ATCAATAATT ATCATTAGTT AATGCCCCAG TAATCAAGTA 


481 


AGCCCATATA TGGAGTTCCG CGTTACATAA CTTACGGTAA ATGGCCCGCC TGGCTGACCG CTCAACGACC CCCGCCCATT 
TCGGGTATAT ACCTCAAGGC GCAATGTATT GAATGCCATT TACCGGGCGG ACCGACTGGC GGGTTGCTGG GGGCGGGTAA 


561 


GACGTCAATA ATGACGTATG TTCCCATAGT AACGCCAATA GGGACTTTCC ATTGACGTCA MGGGTGGAG TATTTACGGT 
CTGCAGTTAT TACTGCATAC AAGGGTATCA TTGCGGTTAT CCCTGAAAGG TAACTGCACT TACCCACCTC ATAAATGCCA 


641 


AAACTGCCCA CTTGGCAGTA CATCAAGTGT ATCATATGCC AAGTCCGCCC CCTATTGACG TCAATGAC^ IJ^Irrrr- 
TTTGACGGGT GAACCGTCAT GTAGTTCACA TAGTATACGG TTCAGGCGGG GGATAACTGC AGTTACTGCC ATTTACCGGG 


721 


GCCTGGCATT ATGCCCAGTA CATGACCTTA CGGGACTTTC CTACTTGGCA GTACATCTAC GTATTAGTCA JCGCTATTAC 
CGGACCGTAA TACGGGTCAT GTACTGGAAT GCCCTGAAAG GATGAACCGT CATGTAGATG CATAATCAGT A^CGATAA.G 


801 


CATGGTGATG CGGTTTTGGC AGTACACCAA TGGGCGTGGA TAGCGGTTTG ACTCACGGGG ATTTCCAAGT CTCCACCCCA 
GTACCACTAC GCCAAAACCG TCATGTGGTT ACCCGCACCT ATCGCCAAAC TGAGTGCCCC TAAAGGTTCA GAGGTG^Go* 


881 


TTGACGTCAA TGGGAGTTTG TTTTGGCACC AAAATCAACG GGACTTTCCA AAATGTCGTA AJAACC^GC CCCGTTGACG 
AACTGCAGTT ACCCTCAAAC AAAACCGTGG TTTTAGTTGC CCTGAAAGGT TTTACAGCAT TATTGGQGCG GGGCAAC.G^ 


961 


CAAATGGGCG GTAGGCGTGT ACGGTGGGAG GTCTATATAA GCAGAGCTCG TTTACTGAAC CGTCAGATCG CCTGGAGACG 
GTTTACCCGC CATCCGCACA TGCCACCCTC CAGATATATT CGTCTCGAGC AAATCACTTG GCACTCTACC GGACClCTG.. 


1041 


rrftTrrACGC tgtTTTGACC TCCATAGAAG ACACCGGGAC CGATCCAGCC TCCGCGGCCG GGAACGGTGC ATTGGAACGC 
rSIrrrrS IrlillcTGG AGGTATCTTC TGTGGCCCTG GCTAGGTCGG AGGCGCCGGC CCTTGCCACG TAACCTTGCG 


1121 


GCATTCCCCG TGCCAAGAGT GACGTAAGTA CCGCCTATAG ACTCTATAGC CACACCCCTT TGGCTCTTAT GCATGCTATA 
a?M6^c I^cVEa ctSattSt GGCGGATATC TGAGATATCC GTGTGGGGAA ACCGACAATA CGTACGATAT 


1201 


CTGTTTtTGG CTTGGGGCCT ATACACCCCC GCTCCTTATG CTATAGGTGA t«TATACCT TA^^JAG GTGTGGGTTA 
GACAAAAACC GAACCCCGGA TAIGTGGGGG CGAGGAATAC GATATCCACT ACCATATCGA AiraSATATC CACACCCAAT 


1281 


TTrarCATTA TTGACCACTC CCCTATTCGT GACGATACTT tccattacta ATCCATAACA TGGCTCTTTG ccacaactat 
mctgctS? SaI^ Sgctatgaa AGOTAATGAT TAGGTATTGT accgacaaac GGTGTTGATA 


1361 


rTCTMTGGC TATATGCCAA TACTCTGTCC TTCAGAGACT GACACGGACT ctgtaitttt acaggatggg gtccatttat 

gagmII^ I^SSSgg aagtctctga cigtgcctga gacataaaaa tgtcctaccc caggtaaa.a
TATTTACAAA TTCACATATA CAACAACGCC GTCCCCCGTG CCCGCAGTTT
ATAAATGTTT AAGTGTATAT GTTGTTCCGG CAGGGGGCAC GGGCGTCAAA 


TTATTAAACA 
AATAATTTGT 


TAGCGTGGGA 
ATCGCACCCT 


TCTCCGACAT 
AGAGGCTGTA 


1521 


CTCGGGTACG TGTTCCGGAC ATGGGCTCTT CTCCCGTAGC 
GAGCCCATGC ACAAGGCCTG TACCCGAGAA CAGGCCATCG 


GGCGGAGCTT 
CCGCCTCGAA 


CCACATCCGA 
GGTGTAGGCT 


GCCCTGGTCC 
CGGGACCAGG 


CATCCGTCCA 
GTAGGCAGGT 


1601 


GCGGCTCATG GTCGCTCGGC AGCTCCTTGC TCCTAACAGT 
CGCCGACTAC CAGCGAGCCG TCGAGGAACG AGGATTGTCA 


GGAGGCCAGA 
CCTCCGGTCT 


CTTAGGCACA 


GCACAATGCC 


CACCACCACC 
GTGGTGGTGG 


1681 


AGTGTGCCGC 
TCACACGGCG 


ACAAGGCCGT GGCGGTAGGG TATGTGTCTG 
TGTTCCGGCA CCCCCATCCC ATACACAGAC 


AAAATGAGCT 
TTTTACTCGA 


CGGAGATTGG 
GCCTCTAACC 


GCTCGCACCT 
CGAGCGTGGA 


GGACGCAGAT 
CCTGCG7CTA 


1761 


GGAAGACTTA 
CCTTCTGAAT 


AGGCAGCGGC AGAAGAAGAT GCAGGCAGCT 
TCCGTCGCCG TCTTCTTCTA CGTCCGTCGA 


GACTTGTTGT 
CTCAACAACA 


ATTCTGATAA 
TAAGACTATT 


GAGTCAGAGG 
CTCAGTCTCC 


TAACTCCCGT 
ATTGAGGGCA 


1841 


TGCGGTGCTG 
ACGCCACGAC 


TTAACGGTGG AGGGCAGTGT AGTCTGAGCA 
AATTGCCACC TCCCGTCACA TCAGACTCGT 


GTACTCGTTG 
CATGAGCAAC 


CTGCCGCGCG 
GACGGCGCGC 


CGCCACCAGA 
GCGGTGGTCT 


CATAATAGCT 
GTATTATCGA 


♦ 2 










ECORI 


M A A 


1921 


GACAGACTAA 
CTGTCTGArT 


CAGACTGTTC CTTTCCATGG GTCTTTTCTG 
GTCTGACAAG GAAAGGTACC CAGAAAAGAC 


CAGTCACCGT 
GTCAGTGGCA 


CGTCGACCTA 
GCAGCTGGAT 


AGAATTCACC 
TCTTAAGTGG 


ATGGCTGCAT 
TACCGACGTA 


+ 2 
2001 


Y A A Q 
ATGCAGCTCA 
TACGTCGAGT 


GYK VLVL NPS 
GGGCTATAAG GTGCTAGTAC TCAACCCCTC 
CCCGATATTC CACGATCATG AGTTGGGGAG 


V A A 
TGTTGCTGCA 
ACAACGACGT 


T L G r GAY 
ACACTGGGCT TTGGTGCTTA 
TGTGACCCGA AACCACGAAT 


M S K 
CATGTCCAAG 
GTACAGGTTC 



♦2AHGI DPN tRT GVRT ITT GSP ITYS TYG 
2081 GCTCATGGGA TCGATCCTAA CATCAGGACC GGGGTGAGAA CAATTACCAC TGGCAGCCCC ATCACGTACT CCACCTACGG 
CGAGTACCCT AGCTAGGATT GTAGTCCTGG CCCCACTCTT GTTAATGGTG ACCGTCGGGG TAGTGCATGA GGTGGATGCC 



KFL ADGG CSG GAY Dili CDE CHS IDA 
2161 CAAGTTCCTT GCCCACGGCC GGTGCTCGGG GGGCGCTTAT GACATAATAA TTTGTGACGA GTGCCACTCC ACGGATGCCA 
GTTCAAGGAA CGGCTGCCGC CCACGAGCCC CCCGCGAATA CTGTATTATT AAACACTGCT CACGGTGAGG TGCCTACGGT 



+ 2TSIL GIG TVLD QAE TAG ARLV VLA TAT 
2241 CATCCATCTT GGGCATTGGC ACTGTCCTTG ACCAAGCAGA GACTGCGGGG GCGAGACTGG TTGTGCTCGC CACCGCCACC 
GTAGGTAGAA CCCGTAACCG TGACAGGAAC TGGTTCGTCT CTGACGCCCC CGCTCTGACC AACACGAGCG GTGGCGGTGG 



i'2PPGS VTV PHP NICE VA L STT GEIP FYG 
2321 CCTCCGGGCT CCGTCACTGT CCCCCATCCC AACATCGAGG AGGTTGCTCT GTCCACCACC GGAGAGATCC CTTTTTACGG 
GGAGGCCCGA GGCAGTGACA CGGGGTAGGG TTGTACCTCC TCCAACGAGA CAGGTGGTGG CCTCTCTAGG GAAAAATGCC 



*2 KAI PLEV IKG GRH LIFC HSK KKC DEL 
2401 CAAGGCTATC CCCCTCGAAG TAATCAAGGG GGGGAGACAT CTCATCTTCT GTCATTCAAA GAAGAAGTGC GACGAACTCG 
GTTCCGATAG GGGGAGCTTC ATTAGTTCCC CCCCTCTGTA GAGTAGAAGA CAGTAAGTTT CTTCTTCACG CTGCTTGAGC 



f2AAKL VAL GINA VAY YRG LDVS VIP TSG 
2481 CCGCAAAGCT GGTCGCATTG GGCATCAATG CCG7GGCCTA CTACCGCGGT CTTGACGTGT CCGTCATCCC GACCAGCGGC 
CGCGTTTCGA CCAGCGTAAC CCGTAGTTAC GGCACCGGAT GATGGCGCCA GAACTGCACA GGCAGTAGGG CTGGTCGCCG 



+2 0VVV VAT DA L MTGY TGD FDS VIDC NTC 
2S61 GATGTTGTCG TCGTGGCAAC CCATGCCCTC ATGACCGGCT ATACCGGCGA CTTCGACTCG GTGATAGACT. GCAATACGTG 
CTACAACAGC AGCACCGTTG GCTACGGGAG TACTGGCCGA TATGGCCGCT GAAGCTGAGC CACTATCTCA CGTTATGCAC 



4/100

WO 01/38360                PCT/US00/32326

pCMV-NS35

FIGURE 3 - Page 3

+2 VTQ TVOF SLD PTF TIET ITL PQD AVS 
2641 . TGTCACCCAG ACAGTCGATT TCAGCCTTGA CCCTACCTTC ACCATTGAGA CAATCACGCT CCCCCAAGAT GCTGTCTCCC 
ACAGTGGGTC TGTCAGCTAA ACTCGGAACT GGGATGGAAG TGGTAACTCT GTTAGTGCGA GGGGGTTCTA CGACAGAGGG 



*2RT0R RGR TGRG KPG lYR FVAP GER PSG 
2121 CCACTCAACC TCGGGGCAGG ACTGGCAGGG GGAAGCCAGG CATCTACAGA TTTGTGGCAC CGGGGGAGCG CCCCTCCGGC 
CGTGAGTTGC AGCCCCGTCC TGACCGTCCC CCTTCGGTCC GTAGATGTCT AAACACCGTG GCCCCCTCGC GGGGAGGCCG 

f2MrOS SVL CEC YDAG CAW YEL TPAE TTV 
2801 ATGT7CGACT CGTCCGTCCT CTGTGAGTGC TATGACGCAG GCTGTGCTTG 3TATGAGCTC ACGCCCGCCG AGACTACAGT 
TACAAGCTGA GCAGGCAGGA GACACTCACG ATACTGCGTC CGACACGAAC CATACTCGAG TGCGGGCGGC TCTGATGTCA 

♦ 2 RLR AVMN TPG LPV CQDH .LEF WEG VFT 

StuI 

2881 TAGGCTACGA GCGTACATGA ACACCCCGGG GCTTCCCGTG TGCCAGGACC ATCTT6AATT TTGGGAGGGC GTCTTTACAG 
ATCCGATGCT CGCATGTACT TGTGGGGCCC CGAAGGGCAC ACGGTCCTGG TAGAACTTAA AACCCTCCCG CAGAAATGTC 

4.2GLTH IDA HFLS QTK QSG EMLP YLV AYQ 

stur 

2961 GCCTCACTCA TATAGATGCC CACTTTCTAT CCCAGACAAA GCAGAGTGGG GAGAACCTTC CTTACCTGGT AGCGTACCAA 
CGGAGTGAGT ATATCTACGG GTGAAAGATA GGGTCTGTTT CGTCTCACCC CTCTTGGAAG GAATGGACCA TCGCATGGTT 



*2ATVC ARA QAP PPSW DQM WKC LIRL KPT 
3041 GCCACCGTGT GCGCTAGGGC TCAAGCCCCT CCCCCATCGT GGGACCAGAT GT6GAAGTGT TTGATTCGCC TCAAGCCCAC 
CGGTGGCACA CGCGATCCCG AGTTCGGGGA GGGGGTAGCA CCCTGGTCTA CACCTTCACA AACTAAGCCG AGTTCGGGTG 

+2 LHG PTPL LYR LGA VQNB ITL THP VTK 
3121 CCTCCATGGG CCAACACCCC TGCTATACAG ACTGGGCGCT GTTCAGAATG AAATCACCCT GACGCACCCA GTCACCAPAT 
GGAGGTACCC GGTTGTGGGG ACGATATGTC TCACCCGCGA CAAGTCTTAC TTTAGTGGGA CTGCGTGGGT CAGTGGTTTA 



+2YIMT CMS ADLE VVT STW VLVG GVL AAL 
3201 ACATCATGAC ATGCATGTCC GCCGACCTCG AGGTCGTCAC GAGCACCTGG GTGCTCGTTG GCGGCGTCCT GGCTGCTTTG 
TGTAG?ACTG TACGTACAGC CGGCTGGACC TCCAGCAGTG CTCGTGGACC CACGAGCAAC CGCCGCAGGA CCGACGAAAC 

*2AAYC LST CCV VIVG RVV LSG KPAI IPD 
3281 GCCGCGTATT GCCTGTCAAC AGGCTGCGTG GTCATAGTGG GCAGGGTCGT CTTGTCCGGG AAGCCGGCAA TCATACCTGA 
CGKGcI^AA ^GACAGTTG TCCGACGCAC CAGTATCACC CGTCCCACCA GAACAGGCCC TTCGGCCGTT AGTATGGACT 



^-2 REV LYRE FDE MEE CSQH LPY lEQ GMM 
3361 CAGGGAAGTC CTCTACCGAG ACTTCGATGA GATCGAAGAG TGCTCTCAGC ACTTACCGTA CATCGAGCAA GGGATGATGC 
GTCCCTTCAG GAGATGGCTC TCAAGCTACT CTACCTTCTC ACGAGAGTCG TG AATGGCAT GTAGCTCGTT CCCTACTAC. 

♦ 2LAE0 FKQ KALG LLQ TAS RQAE VIA PAV 
3441 TCGCCGAGCi GTTCAAGCAG AAGGCCCTCG GCCTOCTGCA GACCGCGTCC CGTCAGGCAG AGGTTATCGC CCCTGCTGTC 
AGCGGCTCGT CAAGTTCGTC TTCCGGGAGC CGGAGGACCT CTGGCGCAGG GCAGTCCGTC TCCAATAGCG GGGACGACAG 

*2 Q r H m QKt BTF WAKH MWN FIS GIQY ^Jl^^^ G 
3521 CAGACCAACT GGCAAAAACT CGAGACCTTC TGGGCGAAGC ATATCTGGAA CTTCATCAGT GGGATACAAT ACTTGGCGGG 
GTCTGGTTGA CCCTTTTTC GCTCTGGAAG ACCCGCTTCG TATACACCTT GAAGTAGTCA CCCTATGTrA TGAACCGCCC 



36oI' cttgtLvcg ctgcctg^taVc^cg^ca^ tg^^attg Jtggctttta^ca^^ ^S^t gI^ggtgIt 

GAACAGTTGC CACGCACCAT TGGGGCGGTA ACGAAGTAAC TACCGAAAAT GTCGACGACA GTGGTCGGGT GATTQGTGAI 

ififit^ gccSaaccct cctcttcaac atattggggg^ggtgggtggc tgcccacctc cccgcccccg*^gtgccgctac tgcctttgtg 
CGG^gI §SgSg rllll^ SS^CCACCG acgggtcgag cggccggggc cacggcgatg acggaaacac 



MAFT AAV TSP LTT
^2GAGL AGA AIG SVGL GKV LIO ILAG YGA
3761 GGCGCTGGCT TAGCTGGCGC CGCCATCGGC AGTGTTGGAC TGGGGAAGGT CCTCATAGAC ATCCTTGCAG GGTATGGCGC 
CCGCGACCGA ATCCACCGCG GCGGTACCCG TCACAACCTG ACCCCTTCCA GGAGTATCTG TAGGAACGTC CCATACCGCG 



*-2 GVA G ALV AFK IMS GEVP STEl DLV NLL 
3841 GGGCGTGGCG GGAGCTCTT6 TGGCATTCAA GATCATGAGC GCTCAGGTCC CCTCCACGGA GGACCTGGTC AATCTACTGC 
CCCGCACCGC CCTCGAGAAC ACCGTAAGTT CTAGTACTCG CCACTCCAGG GGAGGTGCCT CCTGGACCAG TTAGATGACG 

t2PAIL SPG ALVV GVV CAA ILRR HVG PGE 
3921 CCGCCATCCT CTCGCCCGGA GCCCTCGTAG TCGGCGTGGT CTGTGCAGCA ATACTGCGCC GGCACGTTGG CCCGGGCGAG 
GGCGGTAGGA GAGCGGGCCT CGGGAGCATC AGCCGCACCA GACACGTCGT TATGACGCGG CCGTGCAACC GGGCCCGCTC 



+2GAVQ.WMN RLI AFAS RGN HVS PTHY VP£ 
4001 GGGGCAGTGC AGTGGATGAA CCGGCTGATA GCCTTCGCCT CCCCGGGGAA CCATCTTTCC CCCACGCACT ACGTGCCGGA 
CCCCGTCAC6 TCACCTACTT GGCCGACTAT CGGAAGCGGA GGGCCCCCTT GGTACAAAGG GGGTGCGTGA TGCACGGCCT 



+ 2 SDA AARV TAI LSS LTVT QLL RRL HQW 
4081 GAGCGATGCA GCTGCCCGCG TCACTGCCAT ACTCAGCAGC CTCACTGTAA CCCAGCTCCT GAGGCGACTG CACCAGTGGA 
CTCGCTACGT CGACGGGCGC AGTGACGGTA TGAGTCGTCG GAGTGACATT GGGTCGAGGA CTCCGCTGAC GTGGTCACCT 

+2rSSE CTT PCSG SWL RDI WDWI CEV LSD 
4161 TAAGCTCGGA GTGTACCACT CCATGCTCCG GTTCCTGGCT AAGGGACATC TGGGACTGGA TATGCGAGGT GTTGAGCGAC 
ATTCGAGCCT CACATGGTGA GGTACGAGGC CAACGACCGA TTCCCTGTAC ACCCTGACCT ATACGCTCCA CAACTCGCTG 

*2FKTW LKA KLM PQLP GIP TVS CQRG YKG 

BamHI 



4241 TTTAAGACCT GGCTAAAAGC TAAGCTCATG CCACAGCTGC CTGGGATCCC CTTTGTGTCC TGCCAGCGCG GGTATAACGG 
AAATTCTG6A CCGATTTTCG ATTCGAGTAC GGTGTCGACG GACCCTAGGG GAAACACAGG ACGGTCGCGC CCATATTCCC 



*2 VWR GOGI MHT RCH CGAE ITG HVK NGT 
4321 GGTCTGGCGA GGGGACGGCA TCATGCACAC TCGCTGCCAC TGTGGAGCTG AGATCACTGG ACATGTCAAA AACGGGACGA 
CCAGACCGCT CCCCTGCCGT AGTACGTGTG AGCGACGGTG ACACCTCGAC TCTAGTGACC TGTACAGTTT TTGCCCTGCT 

^•2 M RIV GPB TCRN MWS GTF PINA YTT GPC 
4401 TGAGGATCGT CGGTCCTAGG ACCTGCAGGA ACATGTGGAG TGGGACCTTC CCCATTAATG CCTACACCAC GGGCCCCTGT 
ACTCCTAGCA GCCAGGATCC TGGACGTCCT TGTACACCTC ACCCTGGAAG GGGTAATTAC GGATGTGGTG CCCGGGGACA 



^.2TPLP APN YTF ALWR VSA EEY VEIR QVG 
4481 ACCCCCCTTC CTGCGCCGAA CTACACGTTC GCGCTATGGA GGGTGTCTGC AGAGGAATAC GTGGAGATAA GGCAGGTGGG 
TGGGGGGAAG GACGCGGCTT GATGTGCAAG CGCGATACCT CCCACAGACG TCTCCTTATG CACCTCTATT CCGTCCACCC 



+2 OFH YVTG MTT DNL KCPC QVP SPE FFT 
4 561 GGACTTCCAC TACGTGACGG GTATGACTAC TGACAATCTT AAATGCCCGT GCCAGGTCCC ATCGCCCGAA TTTTTCACAG 
CCTGAAGGTG ATGCACTGCC CATACTGATG ACTGTTACAA TTTACGGGCA CGGTCCAGGG TAGCGGGCTT AAAAAGTGxC 



^2ELDG VRL HRFA PPC KPL LREE VSF RVG 
4 641 AATTGGACGG GGTGCGCCTA CATAGGTTTG CGCCCCCCTG CAAGCCCTTG CTGCGGGACG AGGTATCATT ^^^^l^^^ 
TTAACCTGCC CCACGCGGAT GTATCCAAAC GCGGGGGGAC GTTCGGGAAC GACCCCCTCC TCCATAGTAA GTCTCATCCT 



+ 2LHEY PVG SQL PCBP EPD VAV LTSM LTD 
4721 CTCCACGAAT ACCCGGTAGG GTCGCAATTA CCTTGCGAGC CCGAACCGGA CGTGGCCGTG TTGACGTCCA TGCTCACTGA 
GAGGTGCTTA TGGGCCATCC CAGCGTTAAT GGAACGCTCG GGCTTGGCCT GCACCGGCAC A ACTGCAGGT ACGAGTGACT 

+2 PSH ITAE AAG RRL ARGS PPS VAS SSA 
4801 TCCCTCCCAT ATAACAGCAG AGGCGGCCGG GCGAAGGTTG GCGAGGGGAT CACCCCCCTC TGTGGCCAGC TCCTCGGCTA 
AGGGAGGGTA TATTGTCGTC TCCGCCGGCC CGCTTCCAAC CGCTCCCCTA GTGGGGGGAG ACACCGGTCG AGGAGCCGAT 
+2SQLS APS LKAT CTA NHD SPDA ELI EAN 
4881 GCCAGCTATC CGCTCCATCT CTCAAGGCAA CTTGCACCGC TAACCATGAC TCCCCTGATG CTGAGCTCAT AGAGGCCAAC 
CGGTCGATAG GCGAGGTAGA GAGTTCCGTT GAACGTGGCG ATTGGTACTG AGGGGACTAC GACTCGAGTA TCTCCGGTTG 

+2LLWR QEM GGN ITRV ESE NKV VILD SFD 
4961 CTCCTATGGA GGCAGGAGAT GGGCGGCAAC ATCACCAGGG TTGAGTCAGA AAACAAAGTG GTGATTCTGG ACTCCTTCGA 
GAGGATACCT CCGTCCTCTA CCCGCCGTTG TAGTGGTCCC AACTCAGTCT TTTGTTTCAC CACTAAGACC TGAGGAAGCT 



+2 PLV AEED ERE ISV PAEI LRK SRR FAQ 
5041 TCCGCTTGTG GCGGAGGAGG ACGAGCGGGA GATCTCCGTA CCCGCAGAAA TCCTGCGGAA GTCTCGGAGA TTCGCCCAGG 
AGGCGAACAC CGCCTCCTCC TGCTCGCCCT CTAGAGGCAT GGGCGTCTTT AGGACGCCTT CAGAGCCTCT AAGCGGGTCC 

+2ALPV WAR PDYN PPL VET WKKP DYE PPV 
5121 CCCTGCCCGT TTGGGCGCGG CCGGACTATA ACCCCCCGCT AGTGGAGACG TGGAAAAAGC CCGACTACGA ACCACCTGTG 
GGGACGGGCA AACCCGCGCC GGCCTGATAT TGGGGGGCGA TCACCTCTGC ACCTTTTTCG GGCTGATGCT TGGTGGACAC 



>2 VHGC PLP PPK SPPV PPP RKK RTVV LTE 
5201 GTCCATGGCT GCCCGCTTCC ACCTCCAAAG TCCCCTCCTG TGCCTCCGCC TCGGAAGAAG CGGACGGTGG TCCTCACTGA 
CAGGTACCGA CGGGCGAAGG TGGAGGTTTC AGGGGAGGAC ACGGAGGCGG AGCCTTCTTC GCCTGCCACC AGGAGTGACT 

+2 STL STAL AEL ATR SFGS SST SGI TGD 
5281 ATCAACCCTA TCTACTGCCT TGGCCGAGCT CGCCACCAGA AGCTTTGGCA GCTCCTCAAC TTCCGGCATT ACGGGCGACA 
TAGTTGGGAT AGATGACGGA ACCGGCTCGA GCGGTGGTCT TCGAAACCGT CGAGGAGTTG AAGGCCGTAA TGCCCGCTGT 



+2NTTT SSE PAPS GCP PDS DAES YSS MPP 
5361 ATACGACAAC ATCCTCTGAG CCCGCCCCTT CTGGCTGCCC CCCCGACTCC GACGCTGAGT CCTATTCCTC CATGCCCCCC 
TATGCTGTTG TAGGAGACTC GGGCGGGGAA GACCGACGGG GGGGCTGAGG CTGCGACTCA GGATAAGGAG GTACGGGGGG 

+2 LEGE PGD PDL SDGS WST VSS EANA EDV 
BamHI 



5441 CTGGAGGGGG AGCCTGGGGA TCCGGATCTT AGCGACGGGT CATGGTCAAC GGTCAGTAGT GAGGCCAACG CGGAGGATGT 
GACCTCCCCC TCGGACCCCT AGGCCTAGAA TCGCTGCCCA GTACCAGTTG CCAGTCATCA CTCCGGTTGC GCCTCCTACA 

+2 VCC SMSY SWT GAL VTPC AAE EQK LPI 
5521 CGTGTGCTGC TCAATGTCTT ACTCTTGGAC AGGCGCACTC GTCACCCCGT GCGCCGCGGA AGAACAGAAA CTGCCCATCA 
GCACACGACG AGTTACAGAA TGAGAACCTG TCCGCGTGAG CAGTGGGGCA CGCGGCGCCT TCTTGTCTTT GACGGGTAGT 



♦ 2NALS NSL LRHH NLV YST TSRS ACQ RQK 
5601 ATGCACTAAG CAACTCGTTG CTACGTCACC ACAATTTGGT GTATTCCACC ACCTCACGCA GTGCTTGCCA AAGGCAGAAG 
TACGTGATTC GTTGAGCAAC GATGCAGTGG TGTTAAACCA CATAAGGTGG TGGAGTGCGT CACGAACGGT TTCCGTCTTC 



+2KVTF DRL QVL DSHY QDV LKE VKAA ASK 
5681 AAAGTCACAT TTGACAGACT GCAAGTTCTG GACAGCCATT ACCAGGACGT ACTCAAGGAG GTTAAAGCAG CGGCGTCAAA 
TTTCAGTGTA AACTGTCTGA CGTTCAAGAC CTGTCGGTAA TGGTCCTGCA TGAGTTCCTC CAATTTCGTC GCCGCAGTTT 



♦2 VKA NLLS VEE ACS LTPP HSA KSK FGY 
5761 AGTGAAGGCT AACTTGCTAT CCGTAGAGGA AGCTTGCAGC CTGACGCCCC CACACTCAGC CAAATCCAAG TTTGGTTATG 
TCACTTCCGA TTGAACGATA GGCATCTCCT TCGAACGTCG GACTGCGGGG GTGTGAGTCG GTTTAGGTTC AAACCAATAC 



-••2GAKD VRC HARK AVT HIN SVWK DLL EDN 
5841 GGGCAAAAGA CGTCCGTTGC CATGCCAGAA AGGCCGTAAC CCACATCAAC TCCGTGTGGA AAGACCTTCT GGAAGACAAT 
CCCGTTTTCT GCAGGCAACG GTACGGTCTT TCCGGCATTG GGTGTAGTTG AGGCACACCT TTCTGGAAGA CCTTCTGTTA 



+2VTPI DTT IMA KNEV FCV QPE KGGR KPA 
5921 GTAACACCAA TAGACACTAC CATCATGGCT AAGAACGAGG TTTTCTGCGT TCAGCCTGAG AAGGGGGGTC GTAAGCCAGC 
CATTGTGGTT ATCTGTGATG GTAGTACCGA TTCTTGCTCC AAAAGACGCA AGTCGGACTC TTCCCCCCAG CATTCGGTCG 
+2 RLI VFPD LGV RVC EKMA LYD VVT KLP 
6001 TCGTCTCATC GTGTTCCCCG ATCTGGGCGT GCGCGTGTGC GAAAAGATGG CTTTGTACGA CGTGGTTACA AAGCTCCCCT 
AGCAGAGTAG CACAAGGGGC TAGACCCGCA CGCGCACACG CTTTTCTACC GAAACATGCT GCACCAATGT TTCGAGGGGA 

+2LAVM GSS YGFQ YSP GQR VEFL VQA WKS 

EcoRI 



6081 TGGCCGTGAT GGGAAGCTCC TACGGATTCC AATACTCACC AGGACAGCGG GTTGAATTCC TCGTGCAAGC GTGGAAGTCC 
ACCGGCACTA CCCTTCGAGG ATGCCTAAGG TTATGAGTGG TCCTGTCGCC CAACTTAAGG AGCACGTTCG CACCTTCAGG 

+2KKTP MGF SYD TRCF DST VTE SDIR TEE 
6161 AAGAAAACCC CAATGGGGTT CTCGTATGAT ACCCGCTGCT TTGACTCCAC AGTCACTGAG AGCGACATCC GTACGGAGGA 
TTCTTTTGGG GTTACCCCAA GAGCATACTA TGGGCGACGA AACTGAGGTG TCAGTGACTC TCGCTGTAGG CATGCCTCCT 



^2 AIY QCCD LDP QAR VAIK SLT ERL YVG 
6241 GGCAATCTAC CAATGTTGTG ACCTCGACCC CCAAGCCCGC GTGGCCATCA AGTCCCTCAC CGAGAGGCTT TATGTTGGGG 
CCGTTAGATG GTTACAACAC TGGAGCTGGG GGTTCGGGCG CACCGGTAGT TCAGGGAGTG GCTCTCCGAA ATACAACCCC 



+2GPLT NSR GENC GYR RCR ASGV LTT SCG 
6321 GCCCTCTTAC CAATTCAAGG GGGGAGAACT GCGGCTATCG CAGGTGCCGC GCGAGCGGCG TACTGACAAC TAGCTGTGGT 
CGGGAGAATG GTTAAGTTCC CCCCTCTTGA CGCCGATAGC GTCCACGGCG CGCTCGCCGC ATGACTGTTG ATCGACACCA 

♦ 2NTLT CYI KAR AACR AAG LQD CTML VCG 
6401 AACACCCTCA CTTGCTACAT CAAGGCCCGG GCAGCCTGTC GAGCCGCAGG GCTCCAGGAC TGCACCATGC TCGTGTGTGG 
TTGTGGGAGT GAACGATGTA GTTCCGGGCC CGTCGGACAG CTCGGCGTCC CGAGGTCCTG ACGTGGTACG AGCACACACC 



4.2 DDL VVIC ESA GVQ EDAA SLR AFT EAM 
6481 CGACGACTTA GTCGTTATCT GTGAAAGCGC GGGGGTCCAG GAGGACGCGG CGAGCCTGAG AGCCTTCACG GAGGCTATGA 
GCTGCTGAAT CAGCAATAGA CACTTTCGCG CCCCCAGGTC CTCCTGCGCC GCTCGGACTC TCGGAAGTGC CTCCGATACT 

+2TRYS APP GDPP QPE YDL ELIT SCS SNV 
6561 CCAGGTACTC CGCCCCCCCT GGGGACCCCC CACAACCAGA ATACGACTTG GAGCTCATAA CATCATGCTC CTCCAACGTG 
GGTCCATGAG GCGGGGGGGA CCCCTGGGGG GTGTTGGTCT TATGCTGAAC CTCGAGTATT GTAGTACGAG GAGGTTGCAC 

+2SVAH DGA GKR VYYL TRD PTT PLAR AAW 
6641 TCAGTCGCCC ACGACGGCGC TGGAAAGAGG GTCTACTACC TCACCCGTGA CCCTACAACC CCCCTCGCGA GAGCTGCGTG 
AGTCAGCGGG TGCTGCCGCG ACCTTTCTCC CAGATGATGG AGTGGGCACT GGGATGTTGG GGGGAGCGCT CTCGACGCAC 



+2 ETA RHTP VNS WLG NIIM FAP TLW ARM 
6721 GGAGACAGCA AGACACACTC CAGTCAATTC CTGGCTAGGC AACATAATCA TGTTTGCCCC CACACTGTGG GCGAGGATGA 
CCTCTGTCGT TCTGTGTGAG GTCAGTTAAG GACCGATCCG TTGTATTAGT ACAAACGGGG GTGTGACACC CGCTCCTACT 



♦ 2ILMT HFF SVLI ARD QLE QALD CEI YGA 
6801 TACTGATGAC CCATTTCTTT AGCGTCCTTA TAGCCAGGGA CCAGCTTGAA CAGGCCCTCG ATTGCGAGAT CTACGGGGCC 
ATGACTACTG GGTAAAGAAA TCGCAGGAAT ATCGGTCCCT GGTCGAACTT GTCCGGGAGC TAACGCTCTA GATGCCCCGG 



+2 CYSI EPL DLP PIIQ RLH GLS AFSL HSY 
6881 TGCTACTCCA TAGAACCACT GGATCTACCT CCAATCATTC AAAGACTCCA TGGCCTCAGC GCATTTTCAC TCCACAGTTA 
ACGATGAGGT ATCTTGGTGA CCTAGATGGA GGTTAGTAAG TTTCTGAGGT ACCGGAGTCG CGTAAAAGTG AGGTGTCAAT 

♦2 SPG EINR VAA CLR KLGV PPL RAW RHR 
6961 CTCTCCAGGT GAAATCAATA GGGTGGCCGC ATGCCTCAGA AAACTTGGGG TACCGCCCTT GCGAGCTTGG AGACACCGGG 
GAGAGGTCCA CTTTAGTTAT CCCACCGGCG TACGGAGTCT TTTGAACCCC ATGGCGGGAA CGCTCGAACC TCTGTGGCCC 

+2ARSV RAR LLAR GGR AAI CGKY LFN WAV 
7041 CCCGGAGCGT CCGCGCTAGG CTTCTGGCCA GAGGAGGCAG GGCTGCCATA TGTGGCAAGT ACCTCTTCAA CTGGGCAGTA 
GGGCCTCGCA GGCGCGATCC GAAGACCGGT CTCCTCCGTC CCGACGGTAT ACACCGTTCA TGGAGAAGTT GACCCGTCAT 
+2RTKL KLT PIA AAGQ LDL SGW FTAG YSG 
7121 AGAACAAAGC TCAAACTCAC TCCAATAGCG GCCGCTGGCC AGCTGGACTT GTCCGGCTGG TTCACGGCTG GCTACAGCGG 
TCTTGTTTCG AGTTTGAGTG AGGTTATCGC CGGCGACCGG TCGACCTGAA CAGGCCGACC AAGTGCCGAC CGATGTCGCC 



+2 GDIY HSV SHA RPR WIWF CLL LLA AGV 
7201 GGGAGACATT TATCACAGCG TGTCTCATGC CCGGCCCCGC TGGATCTGGT TTTGCCTACT CCTGCTTGCT GCAGGGGTAG 
CCCTCTGTAA ATAGTGTCGC ACAGAGTACG GGCCGGGGCG ACCTAGACCA AAACGGATGA GGACGAACGA CGTCCCCATC 



+2 GIYL LPN R 
7281 GCATCTACCT CCTCCCCAAC CGATGAAGGT TGGGGTAAAC ACTCCGGCCT AAAAAAAAAA AAAAATCTAG AAAGGCGCGC 
CGTAGATGGA GGAGGGGTTG GCTACTTCCA ACCCCATTTG TGAGGCCGGA TTTTTTTTTT TTTTTAGATC TTTCCGCGCG 





BamHI MluI 


7361 


CAAGATATCA AGGATCCACT ACGCGTTAGA GCTCGCTGAT CAGCCTCGAC TGTGCCTTCT AGTTGCCAGC CATCTGTTGT 
GTTCTATAGT TCCTAGGTGA TGCGCAATCT CGAGCGACTA GTCGGAGCTG ACACGGAAGA TCAACGGTCG GTAGACAACA 


7441 


TTGCCCCTCC CCCGTGCCTT CCTTGACCCT GGAAGGTGCC ACTCCCACTG TCCTTTCCTA ATAAAATGAG GAAATTGCAT 
AACGGGGAGG GGGCACGGAA GGAACTGGGA CCTTCCACGG TGAGGGTGAC AGGAAAGGAT TATTTTACTC CTTTAACGTA 


7521 


CGCATTGTCT GAGTAGGTGT CATTCTATTC TGGGGGGTGG GGTGGGGCAG GACAGCAAGG GGGAGGATTG GGAAGACAAT 
GCGTAACAGA CTCATCCACA GTAAGATAAG ACCCCCCACC CCACCCCGTC CTGTCGTTCC CCCTCCTAAC CCTTCTGTTA 


7601 


AGCAGGCATG CTGGGGAGCT CTTCCGCTTC CTCGCTCACT GACTCGCTGC GCTCGGTCGT TCGGCTGCGG CGAGCGGTAT 
TCGTCCGTAC GACCCCTCGA GAAGGCGAAG GAGCGAGTGA CTGAGCGACG CGAGCCAGCA AGCCGACGCC GCTCGCCATA 


7681 


CAGCTCACTC AAAGGCGGTA ATACGGTTAT CCACAGAATC AGGGGATAAC GCAGGAAAGA ACATGTGAGC AAAAGGCCAG 
GTCGAGTGAG TTTCCGCCAT TATGCCAATA GGTGTCTTAG TCCCCTATTG CGTCCTTTCT TGTACACTCG TTTTCCGGTC 


7761 


CAAAAGGCCA GGAACCGTAA AAAGGCCGCG TTGCTGGCGT TTTTCCATAG GCTCCGCCCC CCTGACGAGC ATCACAAAAA 
GTTTTCCGGT CCTTGGCATT TTTCCGGCGC AACGACCGCA AAAAGGTATC CGAGGCGGGG GGACTGCTCG TAGTGTTTTT 


7841 


TCGACGCTCA AGTCAGAGGT GGCGAAACCC GACAGGACTA TAAAGATACC AGGCGTTTCC CCCTGGAAGC TCCCTCGTGC 
AGCTGCGAGT TCAGTCTCCA CCGCTTTGGG CTGTCCTGAT ATTTCTATGG TCCGCAAAGG GGGACCTTCG AGGGAGCACG 


7921 


GCTCTCCTGT TCCGACCCTG CCGCTTACCG GATACCTGTC CGCCTTTCTC CCTTCGGGAA GCGTGGCGCT TTCTCAATGC 
CGAGAGGACA AGGCTGGGAC GGCGAATGGC CTATGGACAG GCGGAAAGAG GGAAGCCCTT CGCACCGCGA AAGAGTTACG 


8001 


TCACGCTGTA GGTATCTCAG TTCGGTGTAG GTCGTTCGCT CCAAGCTGGG CTGTGTGCAC GAACCCCCCG TTCAGCCCGA 
AGTGCGACAT CCATAGAGTC AAGCCACATC CAGCAAGCGA GGTTCGACCC GACACACGTG CTTGGGGGGC AAGTCGGGCT 


8081 


CCGCTGCGCC TTATCCGGTA ACTATCGTCT TGAGTCCAAC CCGGTAAGAC ACGACTTATC GCCACTGGCA GCAGCCACTG 
GGCGACGCGG AATAGGCCAT TGATAGCAGA ACTCAGGTTG GGCCATTCTG TGCTGAATAG CGGTGACCGT CGTCGGTGAC 


8161 


GTAACAGGAT TAGCAGAGCG AGGTATGTAG GCGGTGCTAC AGAGTTCTTG AAGTGGTGGC CTAACTACGG CTACACTAGA 
CATTGTCCTA ATCGTCTCGC TCCATACATC CGCCACGATG TCTCAAGAAC TTCACCACCG GATTGATGCC GATGTGATCT 


8241 


AGGACAGTAT TTGGTATCTG CGCTCTGCTG AAGCCAGTTA CCTTCGGAAA AAGAGTTGGT AGCTCTTGAT CCGGCAAACA 
TCCTGTCATA AACCATAGAC GCGAGACGAC TTCGGTCAAT GGAAGCCTTT TTCTCAACCA TCGAGAACTA GGCCGTTTGT 


8321 


K^r^mrrnrT nrrAnrGGTG GTTTTTTTGT TTGCAAGCAG CAGATTACGC GCAGAAAAAA AGGATCTCAA GAAGATCCTT 
SIISCSS U^^C GTCTAATGCG CGTCTTTTTT TCCTAGAGTT CTTCTAGGAA 


8401 


TGATCTTTTC TACGGGGTCT GACGCTCAGT GGAACGAAAA CTCACGTTAA GGGATTTTGG TCATGAGATT ATCAAAAAGG 
ACTAGAAAAG ATGCCCCAGA CTGCGAGTCA CCTTGCTTTT GAGTGCAATT CCCTAAAACC AGTACTCTAA TAGTTTTTCC 
8481 


ATCTTCACCT AGATCCTTTT AAATTAAAAA TGAAGTTTTA AATCAATCTA AAGTATATAT GAGTAAACTT GGTCTGACAG 
TAGAAGTGGA TCTAGGAAAA TTTAATTTTT ACTTCAAAAT TTAGTTAGAT TTCATATATA CTCATTTGAA CCAGACTGTC 


8561 


TTACCAATGC TTAATCAGTG AGGCACCTAT CTCAGCGATC TGTCTATTTC GTTCATCCAT AGTTGCCTGA CTCCCCGTCG 
AATGGTTACG AATTAGTCAC TCCGTGGATA GAGTCGCTAG ACAGATAAAG CAAGTAGGTA TCAACGGACT GAGGGGCAGC 


8641 


TGTAGATAAC TACGATACGG GAGGGCTTAC CATCTGGCCC CAGTGCTGCA ATGATACCGC GAGACCCACG CTCACCGGCT 
ACATCTATTG ATGCTATGCC CTCCCGAATG GTAGACCGGG GTCACGACGT TACTATGGCG CTCTGGGTGC GAGTGGCCGA 


8721 


CCAGATTTAT CAGCAATAAA CCAGCCAGCC GGAAGGGCCG AGCGCAGAAG TGGTCCTGCA ACTTTATCCG CCTCCATCCA 
GGTCTAAATA GTCGTTATTT GGTCGGTCGG CCTTCCCGGC TCGCGTCTTC ACCAGGACGT TGAAATAGGC GGAGGTAGGT 


8801 


GTCTATTAAT TGTTGCCGGG AAGCTAGAGT AAGTAGTTCG CCAGTTAATA GTTTGCGCAA CGTTGTTGCC ATTGCTACAG 
CAGATAATTA ACAACGGCCC TTCGATCTCA TTCATCAAGC GGTCAATTAT CAAACGCGTT GCAACAACGG TAACGATGTC 


8881 


GCATCGTGGT GTCACGCTCG TCGTTTGGTA TGGCTTCATT CAGCTCCGGT TCCCAACGAT CAAGGCGAGT TACATGATCC 
CGTAGCACCA CAGTGCGAGC AGCAAACCAT ACCGAAGTAA GTCGAGGCCA AGGGTTGCTA GTTCCGCTCA ATGTACTAGG 


8961 


CCCATGTTGT GCAAAAAAGC GGTTAGCTCC TTCGGTCCTC CGATCGTTGT CAGAAGTAAG TTGGCCGCAG TGTTATCACT 
GGGTACAACA CGTTTTTTCG CCAATCGAGG AAGCCAGGAG GCTAGCAACA GTCTTCATTC AACCGGCGTC ACAATAGTGA 


9041 


CATGGTTATG GCAGCACTGC ATAATTCTCT TACTGTCATG CCATCCGTAA GATGCTTTTC TGTGACTGGT GAGTACTCAA 
GTACCAATAC CGTCGTGACG TATTAAGAGA ATGACAGTAC GGTAGGCATT CTACGAAAAG ACACTGACCA CTCATGAGTT 


9121 


CCAAGTCATT CTGAGAATAG TGTATGCGGC GACCGAGTTG CTCTTGCCCG GCGTCAATAC GGGATAATAC CGCGCCACAT 
GGTTCAGTAA GACTCTTATC ACATACGCCG CTGGCTCAAC GAGAACGGGC CGCAGTTATG CCCTATTATG GCGCGGTGTA 


9201 


AGCAGAACTT TAAAAGTGCT CATCATTGGA AAACGTTCTT CGGGGCGAAA ACTCTCAAGG ATCTTACCGC TGTTGAGATC 
TCGTCTTGAA ATTTTCACGA GTAGTAACCT TTTGCAAGAA GCCCCGCTTT TGAGAGTTCC TAGAATGGCG ACAACTCTAG 




CAGTTCGATG TAACCCACTC GTGCACCCAA CTGATCTTCA GCATCTTTTA CTTTCACCAG CGTTTCTGGG TGAGCAAAAA 
GTCAAGCTAC ATTGGGTGAG CACGTGGGTT GACTAGAAGT CGTAGAAAAT GAAAGTGGTC GCAAAGACCC ACTCGTTTTT 


9361 


CAGGAAGGCA AAATGCCGCA AAAAAGGGAA TAAGGGCGAC ACGGAAATGT TGAATACTCA TACTCTTCCT TTTTCAATAT 
GTCCTTCCGT TTTACGGCGT TTTTTCCCTT ATTCCCGCTG TGCCTTTACA ACTTATGAGT ATGAGAAGGA AAAAGTTATA 


9441 


TATTGAAGCA TTTATCAGGG TTATTGTCTC ATGAGCGGAT ACATATTTGA ATGTATTTAG AAAAATAAAC AAATAGGGGT 
ATAACTTCGT AAATAGTCCC AATAACAGAG TACTCGCCTA TGTATAAACT TACATAAATC TTTTTATTTG TTTATCCCCA 


9521 


TCCGCGCACA TTTCCCCGAA AAGTGCCACC TGACGTCTAA GAAACCATTA TTATCATGAC ATTAACCTAT AAAAATAGGC 
AGGCGCGTGT AAAGGGGCTT TTCACGGTGG ACTGCAGATT CTTTGGTAAT AATAGTACTG TAATTGGATA TTTTTATCCG 


9601 


GTATCACGAG GCCCTTTCGT C 
CATAGTGCTC CGGGAAAGCA G 
1 


TCGCGCGTTT CGGTGATGAC GGTGAAAACC TCTGACACAT GCAGCTCCCG GAGACGGTCA CAGCTTGTCT GTAAGCGGAT 
AGCGCGCAAA GCCACTACTG CCACTTTTGG AGACTGTGTA CGTCGAGGGC CTCTGCCAGT GTCGAACAGA CATTCGCCTA 


81 


GCCGGGAGCA GACAAGCCCG TCAGGGCGCG TCAGCGGGTG TTGGCGGGTG TCGGGGCTGG CTTAACTATG CGGCATCAGA 
CGGCCCTCGT CTGTTCGGGC AGTCCCGCGC AGTCGCCCAC AACCGCCCAC AGCCCCGACC GAATTGATAC GCCGTAGTCT 


161 


StuI 

GCAGATTGTA CTGAGAGTGC ACCATATGAA GCTTTTTGCA AAAGCCTAGG CCTCCAAAAA AGCCTCCTCA CTACTTCTGG 
CGTCTAACAT GACTCTCACG TGGTATACTT CGAAAAACGT TTTCGGATCC GGAGGTTTTT TCGGAGGAGT GATGAAGACC 


241 


AATAGCTCAG AGGCCGAGGC GGCCTCGGCC TCTGCATAAA TAAAAAAAAT TAGTCAGCCA TGGGGCGGAG AATGGGCGGA 
TTATCGAGTC TCCGGCTCCG CCGGAGCCGG AGACGTATTT ATTTTTTTTA ATCAGTCGGT ACCCCGCCTC TTACCCGCCT 


321 


ACTGGGCGGG GAGGGAATTA TTGGCTATTG GCCATTGCAT ACGTTGTATC TATATCATAA TATGTACATT TATATTGGCT 
TGACCCGCCC CTCCCTTAAT AACCGATAAC CGGTAACGTA TGCAACATAG ATATAGTATT ATACATGTAA ATATAACCGA 


401 


CATGTCCAAT ATGACCGCCA TGTTGACATT GATTATTGAC TAGTTATTAA TAGTAATCAA TTACGGGGTC ATTAGTTCAT 
GTACAGGTTA TACTGGCGGT ACAACTGTAA CTAATAACTG ATCAATAATT ATCATTAGTT AATGCCCCAG TAATCAAGTA 


481 


AGCCCATATA TGGAGTTCCG CGTTACATAA CTTACGGTAA ATGGCCCGCC TGGCTGACCG CCCAACGACC CCCGCCCATT 
TCGGGTATAT ACCTCAAGGC GCAATGTATT GAATGCCATT TACCGGGCGG ACCGACTGGC GGGTTGCTGG GGGCGGGTAA 


561 


GACGTCAATA ATGACGTATG TTCCCATAGT AACGCCAATA GGGACTTTCC ATTGACGTCA ATGGGTGGAG TATTTACGGT 
CTGCAGTTAT TACTGCATAC AAGGGTATCA TTGCGGTTAT CCCTGAAAGG TAACTGCAGT TACCCACCTC ATAAATGCCA 


641 


AAACTGCCCA CTTGGCAGTA CATCAAGTGT ATCATATGCC AAGTCCGCCC CCTATTGACG TCAATGACGG TAAATGGCCC 
TTTGACGGGT GAACCGTCAT GTAGTTCACA TAGTATACGG TTCAGGCGGG GGATAACTGC AGTTACTGCC ATTTACCGGG 


721 


GCCTGGCATT ATGCCCAGTA CATGACCTTA CGGGACTTTC CTACTTGGCA GTACATCTAC GTATTAGTCA TCGCTATTAC 
CGGACCGTAA TACGGGTCAT GTACTGGAAT GCCCTGAAAG GATGAACCGT CATGTAGATG CATAATCAGT AGCGATAATG 


801 


CATGGTGATG CGGTTTTGGC AGTACACCAA TGGGCGTGGA TAGCGGTTTG ACTCACGGGG ATTTCCAAGT CTCCACCCCA 
GTACCACTAC GCCAAAACCG TCATGTGGTT ACCCGCACCT ATCGCCAAAC TGAGTGCCCC TAAAGGTTCA GAGGTGGGGT 


881 


TTGACGTCAA TGGGAGTTTG TTTTGGCACC AAAATCAACG GGACTTTCCA AAATGTCGTA ATAACCCCGC CCCGTTGACG 
AACTGCAGTT ACCCTCAAAC AAAACCGTGG TTTTAGTTGC CCTGAAAGGT TTTACAGCAT TATTGGGGCG GGGCAACTGC 


961 


CAAATGGGCG GTAGGCGTGT ACGGTGGGAG GTCTATATAA GCAGAGCTCG TTTAGTGAAC CGTCAGATCG CCTGGAGACG 
GTTTACCCGC CATCCGCACA TGCCACCCTC CAGATATATT CGTCTCGAGC AAATCACTTG GCAGTCTAGC GGACCTCTGC 


1041 


CCATCCACGC TGTTTTGACC TCCATAGAAG ACACCGGGAC CGATCCAGCC TCCGCGGCCG GGAACGGTGC ATTGGAACGC 
GGTAGGTGCG ACAAAACTGG AGGTATCTTC TGTGGCCCTG GCTAGGTCGG AGGCGCCGGC CCTTGCCACG TAACCTTGCG 


1121 


GGATTCCCCG TGCCAAGAGT GACGTAAGTA CCGCCTATAG ACTCTATAGG CACACCCCTT TGGCTCTTAT GCATGCTATA 
CCTAAGGGGC ACGGTTCTCA CTGCATTCAT GGCGGATATC TGAGATATCC GTGTGGGGAA ACCGAGAATA CGTACGATAT 


1201 


CTGTTTTTGG CTTGGGGCCT ATACACCCCC GCTCCTTATG CTATAGGTGA TGGTATAGCT TAGCCTATAG GTGTGGGTTA 
GACAAAAACC GAACCCCGGA TATGTGGGGG CGAGGAATAC GATATCCACT ACCATATCGA ATCGGATATC CACACCCAAT 


1281 


TTGACCATTA TTGACCACTC CCCTATTGGT GACGATACTT TCCATTACTA ATCCATAACA TGGCTCTTTG CCACAACTAT 
AACTGGTAAT AACTGGTGAG GGGATAACCA CTGCTATGAA AGGTAATGAT TAGGTATTGT ACCGAGAAAC GGTGTTGATA 


1361 


CTCTATTGGC TATATGCCAA TACTCTGTCC TTCAGAGACT GACACGGACT CTGTATTTTT ACAGGATGGG GTCCATTTAT 
GAGATAACCG ATATACGGTT ATGAGACAGG AAGTCTCTGA CTGTGCCTGA GACATAAAAA TGTCCTACCC CAGGTAAATA 
1441 TATTTACAAA TTCACATATA CAACAACGCC GTCCCCCGTG CCCGCAGTTT TTATTAAACA TAGCGTGGGA TCTCCGACAT 
ATAAATGTTT AAGTGTATAT GTTGTTGCGG CAGGGGGCAC GGGCGTCAAA AATAATTTGT ATCGCACCCT AGAGGCTGTA 



1521 CTCGGGTACG TGTTCCGGAC ATGGGCTCTT CTCCGGTAGC GGCGGAGCTT CCACATCCGA GCCCTGGTCC CATCCGTCCA 
GAGCCCATGC ACAAGGCCTG TACCCGAGAA GAGGCCATCG CCGCCTCGAA GGTGTAGGCT CGGGACCAGG GTAGGCAGGT 



1601 GCGGCTCATG GTCGCTCGGC AGCTCCTTGC TCCTAACAGT GGAGGCCAGA CTTAGGCACA GCACAATGCC CACCACCACC 
CGCCGAGTAC CAGCGAGCCG TCGAGGAACG AGGATTGTCA CCTCCGGTCT GAATCCGTGT CGTGTTACGG GTGGTGGTGG 



1681 AGTGTGCCGC ACAAGGCCGT GGCGGTAGGG TATGTGTCTG AAAATGAGCT CGGAGATTGG GCTCGCACCT GGACGCAGAT 
TCACACGGCG TGTTCCGGCA CCGCCATCCC ATACACAGAC TTTTACTCGA GCCTCTAACC CGAGCGTGGA CCTGCGTCTA 



1761 GGAAGACTTA AGGCAGCGGC AGAAGAAGAT GCAGGCAGCT GAGTTGTTGT ATTCTGATAA GAGTCAGAGG TAACTCCCGT 
CCTTCTGAAT TCCGTCGCCG TCTTCTTCTA CGTCCGTCGA CTCAACAACA TAAGACTATT CTCAGTCTCC ATTGAGGGCA 



1841 TGCGGTGCTG TTAACGGTGG AGGGCAGTGT AGTCTGAGCA GTACTCGTTG CTGCCGCGCG CGCCACCAGA CATAATAGCT 
ACGCCACGAC AATTGCCACC TCCCGTCACA TCAGACTCGT CATGAGCAAC GACGGCGCGC GCGGTGGTCT GTATTATCGA 



+2 


M A A 

EcoRI 


1921 


GACAGACTAA CAGACTGTTC CTTTCCATGG GTCTTTTCTG CAGTCACCGT CGTCGACCTA AGAATTCACC ATGGCTGCAT 
CTGTCTGATT GTCTGACAAG GAAAGGTACC CAGAAAAGAC GTCAGTGGCA GCAGCTGGAT TCTTAAGTGG TACCGACGTA 


+2 
2001 


YAAQ GYK VLVL NPS VAA TLGF GAY MSK 
ATGCAGCTCA GGGCTATAAG GTGCTAGTAC TCAACCCCTC TGTTGCTGCA ACACTGGGCT TTGGTGCTTA CATGTCCAAG 
TACGTCGAGT CCCGATATTC CACGATCATG AGTTGGGGAG ACAACGACGT TGTGACCCGA AACCACGAAT GTACAGGTTC 


+2 
2081 


AHGI DPN IRT GVRT ITT GSP ITYS TYG 
GCTCATGGGA TCGATCCTAA CATCAGGACC GGGGTGAGAA CAATTACCAC TGGCAGCCCC ATCACGTACT CCACCTACGG 
CGAGTACCCT AGCTAGGATT GTAGTCCTGG CCCCACTCTT GTTAATGGTG ACCGTCGGGG TAGTGCATGA GGTGGATGCC 


2161 


KFL ADGG CSG GAY DIII CDE CHS TDA 
CAAGTTCCTT GCCGACGGCG GGTGCTCGGG GGGCGCTTAT GACATAATAA TTTGTGACGA GTGCCACTCC ACGGATGCCA 
GTTCAAGGAA CGGCTGCCGC CCACGAGCCC CCCGCGAATA CTGTATTATT AAACACTGCT CACGGTGAGG TGCCTACGGT 


+2 
2241 


TSIL GIG TVLD QAE TAG ARLV VLA TAT 
CATCCATCTT GGGCATTGGC ACTGTCCTTG ACCAAGCAGA GACTGCGGGG GCGAGACTGG TTGTGCTCGC CACCGCCACC 
GTAGGTAGAA CCCGTAACCG TGACAGGAAC TGGTTCGTCT CTGACGCCCC CGCTCTGACC AACACGAGCG GTGGCGGTGG 


♦2 
2321 


PPGS VTV PHP NIEE VAL STT GEIP FYG 
CCTCCGGGCT CCGTCACTGT GCCCCATCCC AACATCGAGG AGGTTGCTCT GTCCACCACC GGAGAGATCC CTTTTTACGG 
GGAGGCCCGA GGCAGTGACA CGGGGTAGGG TTGTAGCTCC TCCAACGAGA CAGGTGGTGG CCTCTCTAGG GAAAAATGCC 


*2 
2401 


KAI PLEV IKG GRH LIFC HSK KKC DEL 
CAAGGCTATC CCCCTCGAAG TAATCAAGGG GGGGAGACAT CTCATCTTCT GTCATTCAAA GAAGAAGTGC GACGAACTCG 
GTTCCGATAG GGGGAGCTTC ATTAGTTCCC CCCCTCTGTA GAGTAGAAGA CAGTAAGTTT CTTCTTCACG CTGCTTGAGC 


2481 


AAKL VAL GINA VAY YRG LDVS VIP TSG 
CCGCAAAGCT GGTCGCATTG GGCATCAATG CCGTGGCCTA CTACCGCGGT CTTGACGTGT CCGTCATCCC GACCAGCGGC 
GGCGTTTCGA CCAGCGTAAC CCGTAGTTAC GGCACCGGAT GATGGCGCCA GAACTGCACA GGCAGTAGGG CTGGTCGCCG 


+2 
2561 


DVVV VAT DAL MTGY TGD FDS VIDC NTC 
GATGTTGTCG TCGTGGCAAC CGATGCCCTC ATGACCGGCT ATACCGGCGA CTTCGACTCG GTGATAGACT GCAATACGTG 
CTACAACAGC AGCACCGTTG GCTACGGGAG TACTGGCCGA TATGGCCGCT GAAGCTGAGC CACTATCTGA CGTTATGCAC 
+2 VTQ TVDF SLD PTF TIET ITL PQD AVS 
2641 TGTCACCCAG ACAGTCGATT TCAGCCTTGA CCCTACCTTC ACCATTGAGA CAATCACGCT CCCCCAAGAT GCTGTCTCCC 
ACAGTGGGTC TGTCAGCTAA AGTCGGAACT GGGATGGAAG TGGTAACTCT GTTAGTGCGA GGGGGTTCTA CGACAGAGGG 



+2RTQR RGR TGRG KPG IYR FVAP GER PSG 
2721 GCACTCAACG TCGGGGCAGG ACTGGCAGGG GGAAGCCAGG CATCTACAGA TTTGTGGCAC CGGGGGAGCG CCCCTCCGGC 
CGTGAGTTGC AGCCCCGTCC TGACCGTCCC CCTTCGGTCC GTAGATGTCT AAACACCGTG GCCCCCTCGC GGGGAGGCCG 



+2 MFDS SVL CEC YDAG CAW YEL TPAE TTV 
2801 ATGTTCGACT CGTCCGTCCT CTGTGAGTGC TATGACGCAG GCTGTGCTTG GTATGAGCTC ACGCCCGCCG AGACTACAGT 
TACAAGCTGA GCAGGCAGGA GACACTCACG ATACTGCGTC CGACACGAAC CATACTCGAG TGCGGGCGGC TCTGATGTCA 



+2 RLR AYMN TPG LPV CQDH LEF WEG VFT 

StuI 

2881 TAGGCTACGA GCGTACATGA ACACCCCGGG GCTTCCCGTG TGCCAGGACC ATCTTGAATT TTGGGAGGGC GTCTTTACAG 
ATCCGATGCT CGCATGTACT TGTGGGGCCC CGAAGGGCAC ACGGTCCTGG TAGAACTTAA AACCCTCCCG CAGAAATGTC 



+2GLTH IDA HFLS QTK QSG ENLP YLV AYQ 
StuI 

2961 GCCTCACTCA TATAGATGCC CACTTTCTAT CCCAGACAAA GCAGAGTGGG GAGAACCTTC CTTACCTGGT AGCGTACCAA 

CGGAGTGAGT ATATCTACGG GTGAAAGATA GGGTCTGTTT CGTCTCACCC CTCTTGGAAG GAATGGACCA TCGCATGGTT 



+2ATVC ARA QAP PPSW DQM WKC LIRL KPT 
3041 GCCACCGTGT GCGCTAGGGC TCAAGCCCCT CCCCCATCGT GGGACCAGAT GTGGAAGTGT TTGATTCGCC TCAAGCCCAC 
CGGTGGCACA CGCGATCCCG AGTTCGGGGA GGGGGTAGCA CCCTGGTCTA CACCTTCACA AACTAAGCGG AGTTCGGGTG 



+ 2 LHG PTPL LYR LGA VQNE ITL THP VTK 
3121 CCTCCATGGG CCAACACCCC TGCTATACAG ACTGGGCGCT GTTCAGAATG AAATCACCCT GACGCACCCA GTCACCAAAT 
GGAGGTACCC GGTTGTGGGG ACGATATGTC TGACCCGCGA CAAGTCTTAC TTTAGTGGGA CTGCGTGGGT CAGTGGTTTA 



+2YIMT CMS ADLE VVT STW VLVG GVL AAL 
3201 ACATCATGAC ATGCATGTCG GCCGACCTCG AGGTCGTCAC GAGCACCTGG GTGCTCGTTG GCGGCGTCCT GGCTGCTTTG 
TGTAGTACTG TACGTACAGC CGGCTGGAGC TCCAGCAGTG CTCGTGGACC CACGAGCAAC CGCCGCAGGA CCGACGAAAC 

+2AAYC LST GCV VIVG RVV LSG KPA IIPD 
3281 GCCGCGTATT GCCTGTCAAC AGGCTGCGTG GTCATAGTGG GCAGGGTCGT CTTGTCCGGG AAGCCGGCAA TCATACCTGA 
CGGCGCATAA CGGACAGTTG TCCGACGCAC CAGTATCACC CGTCCCAGCA GAACAGGCCC TTCGGCCGTT AGTATGGACT 



+2 REV LYRE FDE MEE CSQH LPY IEQ GMM 
3361 CAGGGAAGTC CTCTACCGAG AGTTCGATGA GATGGAAGAG TGCTCTCAGC ACTTACCGTA CATCGAGCAA GGGATGATGC 
GTCCCTTCAG GAGATGGCTC TCAAGCTACT CTACCTTCTC ACGAGAGTCG TGAATGGCAT GTAGCTCGTT CCCTACTACG 



^2LAEQ FKQ KALG LLQ TAS RQAE VIA PAV 
3441 TCGCCGAGCA GTTCAAGCAG AAGGCCCTCG GCCTCCTGCA GACCGCGTCC CGTCAGGCAG AGGTTATCGC CCCTGCTGTC 
AGCGGCTCGT CAAGTTCGTC TTCCGGGAGC CGGAGGACGT CTGGCGCAGG GCAGTCCGTC TCCAATAGCG GGGACGACAG 

+2QTNW QKL ETF WAKH MWN FIS GIQY LAG 
3521 CAGACCAACT GGCAAAAACT CGAGACCTTC TGGGCGAAGC ATATGTGGAA CTTCATCAGT GGGATACAAT ACTTGGCGGG 
GTCTGGTTGA CCGTTTTTGA GCTCTGGAAG ACCCGCTTCG TATACACCTT GAAGTAGTCA CCCTATGTTA TGAACCGCCC 

+2 LST LPGN PAI ASL MAFT AAV TSP LTT 
3601 CTTGTCAACG CTGCCTGGTA ACCCCGCCAT TGCTTCATTG ATGGCTTTTA CAGCTGCTGT CACCAGCCCA CTAACCACTA 
GAACAGTTGC GACGGACCAT TGGGGCGGTA ACGAAGTAAC TACCGAAAAT GTCGACGACA GTGGTCGGGT GATTGGTGAT 

+2SQTL LFN ILGG WVA AQL AAPG AAT AFV 
3681 GCCAAACCCT CCTCTTCAAC ATATTGGGGG GGTGGGTGGC TGCCCAGCTC GCCGCCCCCG GTGCCGCTAC TGCCTTTGTG 
CGGTTTGGGA GGAGAAGTTG TATAACCCCC CCACCCACCG ACGGGTCGAG CGGCGGGGGC CACGGCGATG ACGGAAACAC 
+2 GAG LAGA AIG SVGL GKV LID ILAG YGA 
3761 GGCGCTGGCT TAGCTGGCGC CGCCATCGGC AGTGTTGGAC TGGGGAAGGT CCTCATAGAC ATCCTTGCAG GGTATGGCGC 
CCGCGACCGA ATCGACCGCG GCGGTAGCCG TCACAACCTG ACCCCTTCCA GGAGTATCTG TAGGAACGTC CCATACCGCG 



+2 GVA GALV AFK IMS GEVP STE DLV NLL 
3841 GGGCGTGGCG GGAGCTCTTG TGGCATTCAA GATCATGAGC GGTGAGGTCC CCTCCACCGA GGACCTGGTC AATCTACTGC 
CCCGCACCGC CCTCGAGAAC ACCGTAAGTT CTAGTACTCG CCACTCCAGG GGAGGTGGCT CCTGGACCAG TTAGATGACG 



+2PAIL SPG ALVV GVV CAA ILRR HVG PGE 
3921 CCGCCATCCT CTCGCCCGGA GCCCTCGTAG TCGGCGTGGT CTGTGCAGCA ATACTGCGCC GGCACGTTGG CCCGGGCGAG 
GGCGGTAGGA GAGCGGGCCT CGGGAGCATC AGCCGCACCA GACACGTCGT TATGACGCGG CCGTGCAACC GGGCCCGCTC 



+ 2GAVQ WMN RLI AFAS RGN HVS PTHY VPE 
4001 GGGGCAGTGC AGTGGATGAA CCGGCTGATA GCCTTCGCCT CCCGGGGGAA CCATGTTTCC CCCACGCACT ACGTGCCGGA 
CCCCGTCACG TCACCTACTT GGCCGACTAT CGGAAGCGGA GGGCCCCCTT GGTACAAAGG GGGTGCGTGA TGCACGGCCT 



*2 SDA AARV TAI LSS LTVT QLL RRL HQW 
4081 GAGCGATGCA GCTGCCCGCG TCACTGCCAT ACTCAGCAGC CTCACTGTAA CCCAGCTCCT GAGGCGACTG CACCAGTGGA 
CTCGCTACGT CGACGGGCGC AGTGACGGTA TGAGTCGTCG GAGTGACATT GGGTCGAGGA CTCCGCTGAC GTGGTCACCT 



+2ISSE CTT PCSG SWL RDI WDWI CEV LSD 
4161 TAAGCTCGGA GTGTACCACT CCATGCTCCG GTTCCTGGCT AAGGGACATC TGGGACTGGA TATGCGAGGT GTTGAGCGAC 
ATTCGAGCCT CACATGGTGA GGTACGAGGC CAAGGACCGA TTCCCTGTAG ACCCTGACCT ATACGCTCCA CAACTCGCTG 



+2FKTW LKA KLM PQLP GIP FVS CQRG YKG 

BamHI 

4241 TTTAAGACCT GGCTAAAAGC TAAGCTCATG CCACAGCTGC CTGGGATCCC CTTTGTGTCC TGCCAGCGCG GGTATAAGGG 
AAATTCTGGA CCGATTTTCG ATTCGAGTAC GGTGTCGACG GACCCTAGGG GAAACACAGG ACGGTCGCGC CCATATTCCC 



♦2 VWR GDGI MHT RCH CGAE ITG HVK NGT 
4321 GGTCTGGCGA GGGGACGGCA TCATGCACAC TCGCTGCCAC TGTGGAGCTG AGATCACTGG ACATGTCAAA AACGGGACGA 
CCAGACCGCT CCCCTGCCGT AGTACGTGTG AGCGACGGTG ACACCTCGAC TCTAGTGACC TGTACAGTTT TTGCCCTGCT 



+2MRIV GPR TCRN MWS GTF PINA YTT GPC 
4401 TGAGGATCGT CGGTCCTAGG ACCTGCAGGA ACATGTGGAG TGGGACCTTC CCCATTAATG CCTACACCAC GGGCCCCTGT 
ACTCCTAGCA GCCAGGATCC TGGACGTCCT TGTACACCTC ACCCTGGAAG GGGTAATTAC GGATGTGGTG CCCGGGGACA 



+2TPLP APN YTF ALWR VSA EEY VEIR QVG 
4481 ACCCCCCTTC CTGCGCCGAA CTACACGTTC GCGCTATGGA GGGTGTCTGC AGAGGAATAC GTGGAGATAA GGCAGGTGGG 
TGGGGGGAAG GACGCGGCTT GATGTGCAAG CGCGATACCT CCCACAGACG TCTCCTTATG CACCTCTATT CCGTCCACCC 



+2 DFH YVTG MTT DNL KCPC QVP SPE FFT 
4561 GGACTTCCAC TACGTGACGG GTATGACTAC TGACAATCTT AAATGCCCGT GCCAGGTCCC ATCGCCCGAA TTTTTCACAG 
CCTGAAGGTG ATGCACTGCC CATACTGATG ACTGTTAGAA TTTACGGGCA CGGTCCAGGG TAGCGGGCTT AAAAAGTGTC 



+2ELDG VRL HRFA PPC KPL LREE VSF RVG 
4641 AATTGGACGG GGTGCGCCTA CATAGGTTTG CGCCCCCCTG CAAGCCCTTG CTGCGGGAGG AGGTATCATT CAGAGTAGGA 
TTAACCTGCC CCACGCGGAT GTATCCAAAC GCGGGGGGAC GTTCGGGAAC GACGCCCTCC TCCATAGTAA GTCTCATCCT 



+2LHEY PVG SQL PCEP EPD VAV LTSM LTD 
4721 CTCCACGAAT ACCCGGTAGG GTCGCAATTA CCTTGCGAGC CCGAACCGGA CGTGGCCGTG TTGACGTCCA TGCTCACTGA 
GAGGTGCTTA TGGGCCATCC CAGCGTTAAT GGAACGCTCG GGCTTGGCCT GCACCGGCAC AACTGCAGGT ACGAGTGACT 



+2 PSH ITAE AAG RRL ARGS PPS VAS SSA 
4801 TCCCTCCCAT ATAACAGCAG AGGCGGCCGG GCGAAGGTTG GCGAGGGGAT CACCCCCCTC TGTGGCCAGC TCCTCGGCTA 
AGGGAGGGTA TATTGTCGTC TCCGCCGGCC CGCTTCCAAC CGCTCCCCTA GTGGGGGGAG ACACCGGTCG AGGAGCCGAT 
+2SQLS APS LKAT CTA NHD SPDA ELI EAN 
4881 GCCAGCTATC CGCTCCATCT CTCAAGGCAA CTTGCACCGC TAACCATGAC TCCCCTGATG CTGAGCTCAT AGAGGCCAAC 
CGGTCGATAG GCGAGGTAGA GAGTTCCGTT GAACGTGGCG ATTGGTACTG AGGGGACTAC GACTCGAGTA TCTCCGGTTG 



+2 LLWR QEM GGN ITRV ESE NKV VILD SFD 
4961 CTCCTATGGA GGCAGGAGAT GGGCGGCAAC ATCACCAGGG TTGAGTCAGA AAACAAAGTG GTGATTCTGG ACTCCTTCGA 
GAGGATACCT CCGTCCTCTA CCCGCCGTTG TAGTGGTCCC AACTCAGTCT TTTGTTTCAC CACTAAGACC TGAGGAAGCT 

+2 PLV AEED ERE ISV PAEI LRK SRR FAQ 
5041 TCCGCTTGTG GCGGAGGAGG ACGAGCGGGA GATCTCCGTA CCCGCAGAAA TCCTGCGGAA GTCTCGGAGA TTCGCCCAGG 
AGGCGAACAC CGCCTCCTCC TGCTCGCCCT CTAGAGGCAT GGGCGTCTTT AGGACGCCTT CAGAGCCTCT AAGCGGGTCC 



+2ALPV WAR PDYN PPL VET WKKP DYE PPV 
5121 CCCTGCCCGT TTGGGCGCGG CCGGACTATA ACCCCCCGCT AGTGGAGACG TGGAAAAAGC CCGACTACGA ACCACCTGTG 
GGGACGGGCA AACCCGCGCC GGCCTGATAT TGGGGGGCGA TCACCTCTGC ACCTTTTTCG GGCTGATGCT TGGTGGACAC 

♦2VHGC PLP PPK SPPV PPP RKK RTVV LTE 
5201 GTCCATGGCT GCCCGCTTCC ACCTCCAAAG TCCCCTCCTG TGCCTCCGCC TCGGAAGAAG CGGACGGTGG TCCTCACTGA 
CAGGTACCGA CGGGCGAAGG TGGAGGTTTC AGGGGAGGAC ACGGAGGCGG AGCCTTCTTC GCCTGCCACC AGGAGTGACT 

+2 STL STAL AEL ATR SFGS SST SGI TGD 
5281 ATCAACCCTA TCTACTGCCT TGGCCGAGCT CGCCACCAGA AGCTTTGGCA GCTCCTCAAC TTCCGGCATT ACGGGCGACA 
TAGTTGGGAT AGATGACGGA ACCGGCTCGA GCGGTGGTCT TCGAAACCGT CGAGGAGTTG AAGGCCGTAA TGCCCGCTGT 

+2 NTTT SSE PAPS GCP PDS DAES YSS MPP 
5361 ATACGACAAC ATCCTCTGAG CCCGCCCCTT CTGGCTGCCC CCCCGACTCC GACGCTGAGT CCTATTCCTC CATGCCCCCC 
TATGCTGTTG TAGGAGACTC GGGCGGGGAA GACCGACGGG GGGGCTGAGG CTGCGACTCA GGATAAGGAG GTACGGGGGG 

+2LEGE PGD PDL SDGS WST VSS EANA EDV 
BamHI 



5441 CTGGAGGGGG AGCCTGGGGA TCCGGATCTT AGCGACGGGT CATGGTCAAC GGTCAGTAGT GAGGCCAACG CGGAGGATGT 
GACCTCCCCC TCGGACCCCT AGGCCTAGAA TCGCTGCCCA GTACCAGTTG CCAGTCATCA CTCCGGTTGC GCCTCCTACA 

*2 VCC SMSY SWT GAL VTPC AAE EQK LPI 
5521 CGTGTGCTGC TCAATGTCTT ACTCTTGGAC AGGCGCACTC GTCACCCCGT GCGCCGCGGA AGAACAGAAA CTGCCCATCA 
GCACACGACG AGTTACAGAA TGAGAACCTG TCCGCGTGAG CAGTGGGGCA CGCGGCGCCT TCTTGTCTTT GACGGGTAGT 



♦2NALS NSL LRHH NLV YST TSRS ACQ RQK 
5601 ATGCACTAAG CAACTCGTTG CTACGTCACC ACAATTTGGT GTATTCCACC ACCTCACGCA GTGCTTGCCA AAGGCAGAAG 
TACGTGATTC GTTGAGCAAC GATGCAGTGG TGTTAAACCA CATAAGGTGG TGGAGTGCGT CACGAACGGT TTCCGTCTTC 



+2KVTF DRL QVL DSHY QDV LKE VKAA ASK 
5681 AAAGTCACAT TTGACAGACT GCAAGTTCTG GACAGCCATT ACCAGGACGT ACTCAAGGAG GTTAAAGCAG CGGCGTCAAA 
TTTCAGTGTA AACTGTCTGA CGTTCAAGAC CTGTCGGTAA TGGTCCTGCA TGAGTTCCTC CAATTTCGTC GCCGCAGTTT 



+ 2 VKA NLLS VEE ACS LTPP HSA KSK FGY 
5761 AGTGAAGGCT AACTTGCTAT CCGTAGAGGA AGCTTGCAGC CTGACGCCCC CACACTCAGC CAAATCCAAG TTTGGTTATG 
TCACTTCCGA TTGAACGATA GGCATCTCCT TCGAACGTCG GACTGCGGGG GTGTGAGTCG GTTTAGGTTC AAACCAATAC 



+2GAKD VRC HARK AVT HIN SVWK DLL EDN 
5841 GGGCAAAAGA CGTCCGTTGC CATGCCAGAA AGGCCGTAAC CCACATCAAC TCCGTGTGGA AAGACCTTCT GGAAGACAAT 
CCCGTTTTCT GCAGGCAACG GTACGGTCTT TCCGGCATTG GGTGTAGTTG AGGCACACCT TTCTGGAAGA CCTTCTGTTA 



+2VTPI DTT IMA KNEV FCV QPE KGGR KPA 
5921 GTAACACCAA TAGACACTAC CATCATGGCT AAGAACGAGG TTTTCTGCGT TCAGCCTGAG AAGGGGGGTC GTAAGCCAGC 
CATTGTGGTT ATCTGTGATG GTAGTACCGA TTCTTGCTCC AAAAGACGCA AGTCGGACTC TTCCCCCCAG CATTCGGTCG 
+2 RLI VFPD LGV RVC EKMA LYD VVT KLP 
6001 TCGTCTCATC GTGTTCCCCG ATCTGGGCGT GCGCGTGTGC GAAAAGATGG CTTTGTACGA CGTGGTTACA AAGCTCCCCT 
AGCAGAGTAG CACAAGGGGC TAGACCCGCA CGCGCACACG CTTTTCTACC GAAACATGCT GCACCAATGT TTCGAGGGGA 



+2LAVM GSS YGFQ YSP GQR VEFL VQA WKS 

EcoRI 

6081 TGGCCGTGAT GGGAAGCTCC TACGGATTCC AATACTCACC AGGACAGCGG GTTGAATTCC TCGTGCAAGC GTGGAAGTCC 
ACCGGCACTA CCCTTCGAGG ATGCCTAAGG TTATGAGTGG TCCTGTCGCC CAACTTAAGG AGCACGTTCG CACCTTCAGG 


6161 


KKTP MGF SYD TRCF DST VTE SDIR TEE 
AAGAAAACCC CAATGGGGTT CTCGTATGAT ACCCGCTGCT TTGACTCCAC AGTCACTGAG AGCGACATCC GTACGGAGGA 
TTCTTTTGGG GTTACCCCAA GAGCATACTA TGGGCGACGA AACTGAGGTG TCAGTGACTC TCGCTGTAGG CATGCCTCCT 


♦2 
6241 


AIY QCCD LDP QAR VAIK SLT ERL YVG 
GGCAATCTAC CAATGTTGTG ACCTCGACCC CCAAGCCCGC GTGGCCATCA AGTCCCTCAC CGAGAGGCTT TATGTTGGGG 
CCGTTAGATG GTTACAACAC TGGAGCTGGG GGTTCGGGCG CACCGGTAGT TCAGGGAGTG GCTCTCCGAA ATACAACCCC 


♦ 2 
6321 


GPLT NSR GENC GYR RCR ASGV LTT SCG 
GCCCTCTTAC CAATTCAAGG GGGGAGAACT GCGGCTATCG CAGGTGCCGC GCGAGCGGCG TACTGACAAC TAGCTGTGGT 
CGGGAGAATG GTTAAGTTCC CCCCTCTTGA CGCCGATAGC GTCCACGGCG CGCTCGCCGC ATGACTGTTG ATCGACACCA 


♦2 
6401 


NTLT CYI KAR AACR AAG LQD CTML VCG 
AACACCCTCA CTTGCTACAT CAAGGCCCGG GCAGCCTGTC GAGCCGCAGG GCTCCAGGAC TGCACCATGC TCGTGTGTGG 
TTGTGGGAGT GAACGATGTA GTTCCGGGCC CGTCGGACAG CTCGGCGTCC CGAGGTCCTG ACGTGGTACG AGCACACACC 


6481 


DDL VVIC ESA GVQ EDAA SLR AFT EAM 
CGACGACTTA GTCGTTATCT GTGAAAGCGC GGGGGTCCAG GAGGACGCGG CGAGCCTGAG AGCCTTCACG GAGGCTATGA 
GCTGCTGAAT CAGCAATAGA CACTTTCGCG CCCCCAGGTC CTCCTGCGCC GCTCGGACTC TCGGAAGTGC CTCCGATACT 


>2 
6561 


TRYS APP GDPP QPE YDL ELIT SCS SNV 
CCAGGTACTC CGCCCCCCCT GGGGACCCCC CACAACCAGA ATACGACTTG GAGCTCATAA CATCATGCTC CTCCAACGTG 
GGTCCATGAG GCGGGGGGGA CCCCTGGGGG GTGTTGGTCT TATGCTGAAC CTCGAGTATT GTAGTACGAG GAGGTTGCAC 


♦2 
6641 


SVAH DGA GKR VYYL TRD PTT PLAR AAW 
TCAGTCGCCC ACGACGGCGC TGGAAAGAGG GTCTACTACC TCACCCGTGA CCCTACAACC CCCCTCGCGA GAGCTGCGTG 
AGTCAGCGGG TGCTGCCGCG ACCTTTCTCC CAGATGATGG AGTGGGCACT GGGATGTTGG GGGGAGCGCT CTCGACGCAC 


♦2 
6721 


ETA RHTP VNS WLG NIIM FAP TLW ARM 
GGAGACAGCA AGACACACTC CAGTCAATTC CTGGCTAGGC AACATAATCA TGTTTGCCCC CACACTGTGG GCGAGGATGA 
CCTCTGTCGT TCTGTGTGAG GTCAGTTAAG GACCGATCCG TTGTATTAGT ACAAACGGGG GTGTGACACC CGCTCCTACT 


♦2 
6801 


ILMT HFF SVLI ARD QLE QALD CEI YGA 
TACTGATGAC CCATTTCTTT AGCGTCCTTA TAGCCAGGGA CCAGCTTGAA CAGGCCCTCG ATTGCGAGAT CTACGGGGCC 
ATGACTACTG GGTAAAGAAA TCGCAGGAAT ATCGGTCCCT GGTCGAACTT GTCCGGGAGC TAACGCTCTA GATGCCCCGG 


+2 
6881 


CYSI EPL DLP PIIQ RLH GLS AFSL HSY 
TGCTACTCCA TAGAACCACT GGATCTACCT CCAATCATTC AAAGACTCCA TGGCCTCAGC GCATTTTCAC TCCACAGTTA 
ACGATGAGGT ATCTTGGTGA CCTAGATGGA GGTTAGTAAG TTTCTGAGGT ACCGGAGTCG CGTAAAAGTG AGGTGTCAAT 


♦2 
6961 


SPG EINR VAA CLR KLGV PPL RAW RHR 
CTCTCCAGGT GAAATCAATA GGGTGGCCGC ATGCCTCAGA AAACTTGGGG TACCGCCCTT GCGAGCTTGG AGACACCGGG 
GAGAGGTCCA CTTTAGTTAT CCCACCGGCG TACGGAGTCT TTTGAACCCC ATGGCGGGAA CGCTCGAACC TCTGTGGCCC 


+2 
7041 


ARSV RAR LLAR GGR AAI CGKY LFN 
CCXGGAGCGT CCGCGCTAGG CTTCTGGCCA GAGGACGCAG GGCTGCCATA TGTGGCAAGT ACCTC7TCAA 
GGGCCTCGCA GGCGCGATCC GAAGACCGGT CTCCTCCGTC CCGACGGTA7 ACACCGTTCA TGGAGAAGTT 


WAV 
CTGGGCAGTA 
GACCCGTCAT 
+2 RTKL KLT PXA AAGQ LDL SGH FTAG YSG
7121 AGAACAAAGC TCAAACTCAC TCCAATAGCG GCCGCTGGCC AGCTGGACTT GTCCGGCTGC TTCACGGCTG GCTACAGCGG
TCTTGTTTCC AGTTTGAGTG AGGTTATCGC CGGCGACCGG TCGACCTGAA CAGGCCGACC AAGTGCCGAC CGATGTCGCC



+ 2 GDI Y HSV SHA RPR WIWF CLL LLA AGV 
7201 GGGAGACATT TATCACAGCG TGTCTCATGC CCGGCCCCGC TGGATCTGGT TTTGCCTACT CCTGCTTGCT GCAGCGGTAG 
CCCTCTGTAA ATAGTGTCGC ACAGAGTACG GGCCGGGGCG ACCTAGACCA AAACGGATGA GGACGAACGA CGTCCCCATC 



+2 GIYL LPN R
7281 GCATCTACCT CCTCCCCAAC CGATGAAGGT TGGGGTAAAC ACTCCGGCCT AAAAAAAAAA AAAAATCTAG AAAGGCGCGC 
CGTAGATGGA GGAGGGGTTG GCTACTTCCA ACCCCATTTG TGAGGCCGGA TTTTTTTTTT TTTTTAGATC TTTCCGCGCG 



BamHI MluI



7361 CAAGATATCA AGCATCCACT ACGCGTTAGA GCTCGCTGAT CAGCCTCGAC TGTGCCTTCT AGTTGCCAGC CATCTGTTGT 
GTTCTATAGT TCCTAGGTGA TGCGCAATCT CGAGCGACTA GTCGGAGCTG ACACGGAAGA TCAACGGTCG GTAGACAACA 



7441 TTGCCCCTCC CCCGTGCCTT CCTTGACCCT GGAAGGTGCC ACTCCCACTG TCCTTTCCTA ATAAAATGAG GAAATTGCAT
AACGGGGAGG GGGCACGGAA GGAACTGGGA CCTTCCACGG TGAGGGTGAC AGGAAAGGAT TATTTTACTC CTTTAACGTA 



7521 CGCATTGTCT GAGTAGGTGT CATTCTATTC TGGGGGGTGG GGTGGGGCAG GACAGCAAGG GGGAGGATTG GGAAGACAAT
GCGTAACAGA CTCATCCACA GTAAGATAAG ACCCCCCACC CCACCCCGTC CTGTCGTTCC CCCTCCTAAC CCTTCTGTTA 



7601 AGCAGGCATG CTGGGGAGCT CTTCCGCTTC CTCGCTCACT GACTCGCTGC GCTCGGTCGT TCGGCTGCGG CGAGCGGTAT
TCGTCCGTAC GACCCCTCGA GAAGGCGAAG GAGCGAGTGA CTGAGCGACG CGAGCCAGCA AGCCGACGCC GCTCGCCATA 



7681 CAGCTCACTC AAAGGCGGTA ATACGGTTAT CCACAGAATC AGGGGATAAC GCAGGAAAGA ACATGTGAGC AAAAGGCCAG 
GTCGAGTGAG TTTCCGCCAT TATGCCAATA GGTGTCTTAG TCCCCTATTG CGTCCTTTCT TGTACACTCG TTTTCCGGTC 



7761 CAAAAGGCCA GGAACCGTAA AAAGGCCGCG TTGCTGGCGT TTTTCCATAG GCTCCGCCCC CCTGACGAGC ATCACAAAAA 
GTTTTCCGGT CCTTGGCATT TTTCCGGCGC AACGACCGCA AAAAGGTATC CGAGGCGGGG GGACTGCTCG TAGTGTTTTT 



7841 TCGACGCTCA AGTCAGAGGT GGCGAAACCC GACAGGACTA TAAAGATACC AGGCGTTTCC CCCTGGAAGC TCCCTCGTGC 
AGCTGCGAGT TCAGTCTCCA CCGCTTTGGG CTGTCCTGAT ATTTCTATGG TCCGCAAAGG GGGACCTTCG AGGGAGCACG 



7921 GCTCTCCTGT TCCGACCCTG CCGCTTACCG GATACCTGTC CGCCTTTCTC CCTTCGGGAA GCGTGGCGCT TTCTCAATGC 
CGAGAGGACA AGGCTGGGAC GGCGAATGGC CTATGGACAG GCGGAAAGAG GGAAGCCCTT CGCACCGCGA AAGAGTTACG 



8001 TCACGCTGTA GGTATCTCAG TTCGGTGTAG GTCGTTCGCT CCAAGCTGGG CTGTGTGCAC GAACCCCCCG TTCAGCCCGA 
AGTGCGACAT CCATAGAGTC AAGCCACATC CAGCAAGCGA GGTTCGACCC GACACACGTG CTTGGGGGGC AAGTCGGGCT



8081 CCGCTGCGCC TTATCCGGTA ACTATCGTCT TGAGTCCAAC CCGGTAAGAC ACGACTTATC GCCACTGGCA GCAGCCACTG 
GGCGACGCGG AATAGGCCAT TGATAGCAGA ACTCAGGTTG GGCCATTCTG TGCTGAATAG CGGTGACCGT CGTCGGTGAC



8161 GTAACAGGAT TAGCAGAGCG AGGTATGTAG GCGGTGCTAC AGAGTTCTTG AAGTGGTGGC CTAACTACGG CTACACTAGA 
CATTGTCCTA ATCGTCTCGC TCCATACATC CGCCACGATG TCTCAAGAAC TTCACCACCG GATTGATGCC GATGTGATCT 



8241 AGGACAGTAT TTCGTATCTG CGCTCTGCTG AAGCCAGTTA CCTTCGGAAA AAGAGTTGGT AGCTCTTGAT CCGGCAAACA 
TCCTGTCATA AACCATAGAC GCGAGACGAC TTCGGTCAAT GGAAGCCTTT TTCTCAACCA TCGAGAACTA GGCCGTTTGT 



8321 AACCACCGCT GGTAGCGGTG GTTTTTTTGT TTGCAAGCAG CAGATTACGC GCAGAAAAAA AGGATCTCAA GAAGATCCTT 
TTGGTGGCGA CCATCGCCAC CAAAAAAACA AACGTTCGTC GTCTAATGCG CGTCTTTTTT TCCTAGAGTT CTTCTAGGAA



8401 TGATCTTTTC TACGGGGTCT GACGCTCAGT GGAACGAAAA CTCACGTTAA GGGATTTTGG TCATGAGATT ATCAAAAAGG
ACTAGAAAAG ATGCCCCAGA CTGCGAGTCA CCTTGCTTTT GAGTGCAATT CCCTAAAACC AGTACTCTAA TAGTTTTTCC 
8481 ATCTTCACCT AGATCCTTTT AAATTAAAAA TGAAGTTTTA AATCAATCTA AAGTATATAT GAGTAAACTT GGTCTGACAG
TAGAAGTGGA TCTAGGAAAA TTTAATTTTT ACTTCAAAAT TTAGTTAGAT TTCATATATA CTCATTTGAA CCAGACTGTC

8561 TTACCAATGC TTAATCAGTG AGGCACCTAT CTCAGCGATC TGTCTATTTC GTTCATCCAT AGTTGCCTGA CTCCCCGTCG
AATGGTTACG AATTAGTCAC TCCGTGGATA GAGTCGCTAG ACAGATAAAG CAAGTAGGTA TCAACGGACT GAGGGGCAGC

8641 TGTAGATAAC TACGATACGG GAGGGCTTAC CATCTGGCCC CAGTGCTGCA ATGATACCGC GAGACCCACG CTCACCGGCT
ACATCTATTG ATGCTATGCC CTCCCGAATG GTAGACCGGG GTCACGACGT TACTATGCCG CTCTCGGTGC GAGTGGCCGA

8721 CCAGATTTAT CAGCAATAAA CCAGCCAGCC GGAAGGGCCG AGCGCAGAAG TGGTCCTGCA ACTTTATCCG CCTCCATCCA
GGTCTAAATA GTCGTTATTT GGTCGGTCGG CCTTCCCGGC TCGCGTCTTC ACCAGGACGT TGAAATAGGC GGAGGTAGGT

8801 GTCTATTAAT TGTTGCCGGG AAGCTAGAGT AAGTAGTTCG CCAGTTAATA GTTTGCGCAA CGTTGTTGCC ATTGCTACAG
CAGATAATTA ACAACGGCCC TTCGATCTCA TTCATCAAGC GGTCAATTAT CAAACGCGTT GCAACAACGG TAACGATGTC

8881 GCATCGTGGT GTCACGCTCG TCGTTTGGTA TGGCTTCATT CAGCTCCGGT TCCCAACGAT CAAGGCGAGT TACATGATCC
CGTAGCACCA CAGTGCGAGC AGCAAACCAT ACCGAAGTAA GTCGAGGCCA AGGGTTGCTA GTTCCGCTCA ATGTACTAGG

8961 CCCATGTTGT GCAAAAAAGC GGTTAGCTCC TTCGGTCCTC CGATCGTTGT CAGAAGTAAG TTGGCCGCAG TGTTATCACT
GGGTACAACA CGTTTTTTCG CCAATCGAGG AAGCCAGGAG GCTAGCAACA GTCTTCATTC AACCGGCGTC ACAATAGTGA

9041 CATGGTTATG GCAGCACTGC ATAATTCTCT TACTGTCATG CCATCCGTAA GATGCTTTTC TGTGACTGGT GAGTACTCAA
GTACCAATAC CGTCGTGACG TATTAAGAGA ATGACAGTAC GGTAGGCATT CTACGAAAAG ACACTGACCA CTCATGAGTT

9121 CCAAGTCATT CTGAGAATAG TGTATGCGGC GACCGAGTTG CTCTTGCCCG GCGTCAATAC GGGATAATAC CGCGCCACAT
GGTTCAGTAA GACTCTTATC ACATACGCCG CTGGCTCAAC GAGAACGGGC CGCAGTTATG CCCTATTATG GCGCGGTGTA

9201 AGCAGAACTT TAAAAGTGCT CATCATTGGA AAACGTTCTT CGGGGCGAAA ACTCTCAAGG ATCTTACCGC TGTTGAGATC
TCGTCTTGAA ATTTTCACGA GTAGTAACCT TTTGCAAGAA GCCCCGCTTT TGAGAGTTCC TAGAATGGCG ACAACTCTAG

9281 CAGTTCGATG TAACCCACTC GTGCACCCAA CTGATCTTCA GCATCTTTTA CTTTCACCAG CGTTTCTGGG TGAGCAAAAA
GTCAAGCTAC ATTGGGTGAG CACGTGGGTT GACTAGAAGT CGTAGAAAAT GAAAGTGGTC GCAAAGACCC ACTCGTTTTT

9361 CAGGAAGGCA AAATGCCGCA AAAAAGGGAA TAAGGGCGAC ACGGAAATGT TGAATACTCA TACTCTTCCT TTTTCAATAT
GTCCTTCCGT TTTACGGCGT TTTTTCCCTT ATTCCCGCTG TGCCTTTACA ACTTATGAGT ATGAGAAGGA AAAAGTTATA

9441 TATTGAAGCA TTTATCAGGG TTATTGTCTC ATGAGCGGAT ACATATTTGA ATGTATTTAG AAAAATAAAC AAATAGGGGT
ATAACTTCGT AAATAGTCCC AATAACAGAG TACTCGCCTA TGTATAAACT TACATAAATC TTTTTATTTG TTTATCCCCA

9521 TCCGCGCACA TTTCCCCGAA AAGTGCCACC TGACGTCTAA GAAACCATTA TTATCATGAC ATTAACCTAT AAAAATAGGC
AGGCGCGTGT AAAGGGGCTT TTCACGGTGG ACTGCAGATT CTTTGCTAAT AATAGTACTG TAATTGGATA TTTTTATCCG

9601 GTATCACGAG GCCCTTTCGT C
CATAGTGCTC CGGGAAAGCA G
1


TCGCGCGTTT CGGTGATGAC GGTGAAAACC TCTGACACAT GCAGCTCCCG GAGACGGTCA CAGCTTGTCT GTAAGCGGAT
AGCGCGCAAA GCCACTACTG CCACTTTTGG AGACTGTGTA CGTCGAGGGC CTCTGCCAGT GTCGAACAGA CATTCGCCTA


81 


GCCGGGAGCA GACAAGCCCC TCACGGCGCG TCAGCGGGTG TTGGCGGGTG TCGGGGCTGG CTTAACTATG CGGCATCAGA 
CGGCCCTCGT CTGTTCGCGC AGTCCCGCGC AGTCGCCCAC AACCGCCCAC AGCCCCGACC GAATTGATAC GCCGTAGTCT 


161 


GCAGATTGTA CTGAGAGTGC ACCATATGAA GCTTTTTGCA AAAGCCTAGG CCTCCAAAAA AGCCTCCTCA CTACTTCTGG 
CGTCTAACAT GACTCTCACG TGGTATACTT CGAAAAACGT TTTCGGATCC GGAGGTTTTT TCGGAGGAGT GATGAAGACC 


241 


AATAGCTCAG AGGCCGAGGC GGCCTCGGCC TCTGCATAAA TAAAAAAAAT TAGTCAGCCA TGGGGCGGAG AATGGGCGGA 
TTATCGAGTC TCCGGCTCCG CCGGAGCCGG AGACGTATTT ATTTTTTTTA ATCAGTCGGT ACCCCGCCTC TTACCCGCCT 


321 


ACTGGGCGGG GAGGGAATTA TTGGCTATTG GCCATTGCAT ACGTTGTATC TATATCATAA TATGTACATT TATATTGGCT
TGACCCGCCC CTCCCTTAAT AACCCATAAC CGGTAACGTA TGCAACATAG ATATAGTATT ATACATGTAA ATATAACCGA 


401 


CATGTCCAAT ATGACCGCCA TGTTGACATT GATTATTGAC TAGTTATTAA TAGTAATCAA TTACGGGGTC ATTAGTTCAT
GTACAGGTTA TACTGGCGGT ACAACTGTAA CTAATAACTG ATCAATAATT ATCATTAGTT AATGCCCCAG TAATCAAGTA 


481 


AGCCCATATA TGGAGTTCCG CGTTACATAA CTTACGGTAA ATGGCCCGCC TGGCTGACCG CCCAACGACC CCCGCCCATT
TCGGGTATAT ACCTCAAGGC GCAATGTATT GAATGCCATT TACCGGGCGG ACCGACTGCC GGGTTGCTGG GGGCGGGTAA 


561 


GACGTCAATA ATGACGTATG TTCCCATAGT AACGCCAATA GGGACTTTCC ATTGACGTCA ATGGGTGGAG TATTTACGGT
CTGCAGTTAT TACTGCATAC AAGGGTATCA TTGCGGTTAT CCCTGAAAGG TAACTGCAGT TACCCACCTC ATAAATGCCA 


641 


AAACTGCCCA CTTGGCAGTA CATCAAGTGT ATCATATGCC AAGTCCCCCC CCTATTGACG TCAATGACGG TAAATGGCCC 
TTTGACGGGT GAACCCTCAT CTAGTTCACA TAGTATACGG TTCAGGCGGG GGATAACTGC AGTTACTGCC ATTTACCGGG 


721 


GCCTCGCATT ATGCCCAGTA CATGACCTTA CGGGACTTTC CTACTTGGCA GTACATCTAC GTATTAGTCA TCGCTATTAC 
CGGACCGTAA TACGGGTCAT GTACTGGAAT GCCCTGAAAG GATGAACCGT CATGTAGATG CATAATCAGT AGCGATAATG


801 


CATGGTGATG CGGTTTTGGC ACTACACCAA TGGGCGTGGA TAGCGGTTTG ACTCACGGGG ATTTCCAACT CTCCACCCCA
GTACCACTAC GCCAAAACCG TCATGTGGTT ACCCGCACCT ATCGCCAAAC TGAGTGCCCC TAAAGGTTCA GAGGTGGGGT


881 


TTGACGTCAA TGGGAGTTTG TTTTGGCACC AAAATCAACG GGACTTTCCA AAATGTCGTA ATAACCCCGC CCCGTTGACG 
AACTGCAGTT ACCCTCAAAC AAAACCGTGG TTTTAGTTGC CCTGAAAGGT TTTACAGCAT TATTGGGGCG GGGCAACTGC


961 


CAAATGGGCG GTAGGCGTGT ACGGTGGGAG GTCTATATAA GCAGAGCTCG TTTAGTGAAC CGTCAGATCG CCTGGAGACG 
GTTTACCCGC CATCCGCACA TGCCACCCTC CAGATATATT CGTCTCGAGC AAATCACTTG GCAGTCTAGC GGACCTCTGC 


1041 


CCATCCACGC TGTTTTGACC TCCATAGAAG ACACCGGGAC CGATCCAGCC TCCGCGGCCG GGAACGGTGC ATTGGAACGC
GCTAGGTGCG ACAAAACTGG AGGTATCTTC TGTGGCCCTG GCTAGGTCGG AGGCGCCGGC CCTTGCCACG TAACCTTGCG 


1121 


GGATTCCCCG TGCCAAGAGT GACGTAAGTA CCGCCTATAG ACTCTATAGG CACACCCCTT TGGCTCTTAT GCATGCTATA 
CCTAAGGCGC ACGGTTCTCA CTGCATTCAT GGCGGATATC TGAGATATCC GTGTGGGGAA ACCGAGAATA CGTACGATAT 


1201 


CTGTTTTTGG CTTGGGGCCT ATACACCCCC GCTCCTTATG CTATAGGTGA TGGTATAGCT TAGCCTATAG GTGTGGGTTA 
GACAAAAACC GAACCCCGGA TATGTGGGGG CGAGGAATAC GATATCCACT ACCATATCGA ATCGGATATC CACACCCAAT 


1281 


TTGACCATTA TTGACCACTC CCCTATTGGT GACGATACTT TCCATTACTA ATCCATAACA TCGCTCTTTC CCACAACTAT
AACTGGTAAT AACTGGTGAG GGGATAACCA CTGCTATGAA AGGTAATGAT TAGGTATTGT ACCGAGAAAC GGTGTTCATA 


1361 


CTCTATTGGC TATATGCCAA TACTCTGTCC TTCAGAGACT GACACGGACT CTGTATTTTT ACAGGATGGG GTCCATTTAT
GAGATAACCG ATATACGGTT ATGAGACAGG AAGTCTCTGA CTGTGCCTGA GACATAAAAA TGTCCTACCC CAGGTAAATA 


1441 


TATTTACAAA TTCACATATA CAACAACGCC GTCCCCCGTG CCCGCAGTTT TTATTAAACA TAGCGTGGGA TCTCCGACAT
ATAAATGTTT AAGTGTATAT GTTGTTGCGG CACGGGGCAC GGGCGTCAAA AATAATTTGT ATCGCACCCT AGAGGCTGTA 
1521


CTCGGGTACG TGTTCCGGAC ATGGGCTCTT CTCCGGTAGC GGCGGAGCTT CCACATCCGA GCCCTGGTCC CATCCGTCCA 
GAGCCCATGC ACAAGGCCTG TACCCGAGAA GAGGCCATCG CCGCCTCGAA GGTCTAGGCT CGGGACCAGG GTAGGCAGGT


1601 


GCGGCTCATG CTCGCTCGCC ACCTCCTTGC TCCTAACAGT GGAGGCCAGA CTTAGGCACA GCACAATGCC CACCACCACC 
CGCCGAGTAC CAGCCAGCCC TCGAGGAACG AGGATTGTCA CCTCCGGTCT GAATCCGTGT CGTGTTACGG GTGGTGGTGG 


1681


AGTGTGCCGC ACAAGGCCGT GGCGGTAGGG TATGTGTCTG AAAATGAGCT CGGAGATTGG GCTCGCACCT GGACGCAGAT 
TCACACGGCG TGTTCCGGCA CCGCCATCCC ATACACAGAC TTTTACTCGA GCCTCTAACC CGAGCGTGGA CCTGCGTCTA 


1761


GGAAGACTTA AGGCAGCGGC AGAAGAAGAT GCAGGCAGCT GAGTTGTTGT ATTCTGATAA GAGTCAGAGG TAACTCCCGT
CCTTCTGAAT TCCGTCGCCG TCTTCTTCTA CGTCCCTCGA CTCAACAACA TAAGACTATT CTCAGTCTCC ATTGAGGGCA 


1841 


TGCGGTGCTG TTAACGGTGG AGGGCAGTGT AGTCTGAGCA GTACTCGTTG CTGCCGCGCG CGCCACCAGA CATAATAGCT
ACGCCACGAC AATTGCCACC TCCCGTCACA TCAGACTCGT CATGAGCAAC GACGGCGCGC GCGGTGGTCT GTATTATCGA 




EcoRI 

GACAGACTAA CAGACTGTTC CTTTCCATGG GTCTTTTCTG CAGTCACCGT CGTCGACCTA AGAATTCAGA CTCGAGCAAG
CTGTCTGATT GTCTGACAAG GAAAGGTACC CAGAAAAGAC GTCAGTGGCA GCAGCTGGAT TCTTAAGTCT gagctcgttc 






2001 


tctagaaagg cgcgccaaga tatcaaggat ccactacgcg ttagagctcg ctgatcagcc tcgactgtgc cttctagttg 
agatctttcc gcgcggttct atagttccta ggtgatgcgc aatctcgagc gactagtcgg agctgacacg gaagatcaac 


2081 


ccagccatct gttgtttgcc cctcccccgt gccttccttg accctggaag gtgccactcc cactgtcctt tcctaataaa 
ggtcggtaga caacaaacgg ggagggggca cggaaggaac tgggaccttc cacggtgagg gtgacaggaa aggattattt 


2161 


ATGAGGAAAT TGCATCGCAT TGTCTGAGTA GGTGTCATTC TATTCTGGGG GGTGGGGTGG GGCAGGACAG CAAGGGGGAG 
TACTCCTTTA ACGTAGCGTA ACAGACTCAT CCACAGTAAG ATAAGACCCC CCACCCCACC CCGTCCTGTC GTTCCCCCTC 


2241 


GATTGCGAAG ACAATAGCAG GCATGCTGGG GAGCTCTTCC GCTTCCTCGC TCACTGACTC GCTGCGCTCG GTCGTTCGGC 
CTAACCCTTC TGTTATCGTC CGTACGACCC CTCGAGAAGG CGAAGGAGCG AGTGACTGAG CGACGCGAGC CAGCAAGCCG 


2321 


TGCGGCGAGC GGTATCAGCT CACTCAAAGG CGGTAATACG GTTATCCACA GAATCAGGGG ATAACGCAGG AAAGAACATG 
ACGCCGCTCG CCATAGTCGA GTGAGTTTCC GCCATTATGC CAATAGGTCT CTTACTCCCC TATTGCGTCC TTTCTTGTAC 


2401 


TGAGCAAAAG GCCAGCAAAA GGCCAGGAAC CGTAAAAAGG CCGCGTTGCT GGCGTTTTTC CATAGGCTCC GCCCCCCTGA 
ACTCGTTTTC CGGTCGTTTT CCGGTCCTTG GCATTTTTCC GGCGCAACGA CCGCAAAAAG GTATCCGAGG CGGGGGGACT


2481 


CGAGCATCAC AAAAATCCAC GCTCAAGTCA GAGGTGGCGA AACCCGACAG GACTATAAAG ATACCAGCCG TTTCCCCCTG 
GCTCCTAGTG TTTTTAGCTG CGAGTTCAGT CTCCACCGCT TTGGGCTGTC CTGATATTTC TATGGTCCGC AAACCGGGAC 


2561


GAAGCTCCCT CGTGCGCTCT CCTGTTCCGA CCCTGCCGCT TACCGGATAC CTGTCCGCCT TTCTCCCTTC GGGAAGCGTG 
CTTCGAGGGA GCACGCGAGA GGACAAGGCT GGGACGGCGA ATGGCCTATG GACAGGCGGA AAGAGGGAAG CCCTTCGCAC


2641 


GCGCTTTCTC AATGCTCACG CTGTAGGTAT CTCAGTTCGG TGTAGGTCGT TCGCTCCAAG CTGGGCTGTG TGCACGAACC 
CGCGAAAGAG TTACGAGTGC GACATCCATA GAGTCAAGCC ACATCCAGCA AGCGAGGTTC GACCCGACAC ACGTGCTTGG 


2721 


CCCCGTTCAG CCCGACCGCT CCGCCTTATC CGGTAACTAT CGTCTTGAGT CCAACCCGGT AAGACACGAC TTATCGCCAC
GGGGCAAGTC GGGCTGGCGA CGCGGAATAG GCCATTGATA GCAGAACTCA GGTTGGGCCA TTCTGTGCTG AATAGCGGTG


2801 


TGGCAGCAGC CACTGGTAAC AGGATTAGCA GAGCGAGGTA TGTAGGCGGT GCTACAGAGT TCTTGAAGTG GTGGCCTAAC 
ACCGTCGTCG GTGACCATTG TCCTAATCGT CTCGCTCCAT ACATCCGCCA CGATGTCTCA AGAACTTCAC CACCGGATTG 


2881 


TACGGCTACA CTAGAAGGAC AGTATTTGGT ATCTGCGCTC TGCTGAAGCC AGTTACCTTC GGAAAAAGAG TTGGTAGCTC
ATGCCGATGT GATCTTCCTC TCATAAACCA TACACCCCAG ACGACTTCGG TCAATGGAAG CCTTTTTCTC AACCATCGAG 
2961 TTGATCCGGC AAACAAACCA CCGCTGGTAG CGGTGGTTTT TTTGTTTGCA AGCAGCAGAT TACGCGCAGA AAAAAAGGAT
AACTAGGCCG TTTGTTTGGT GGCGACCATC GCCACCAAAA AAACAAACGT TCGTCGTCTA ATGCGCGTCT TTTTTTCCTA 

3041 CTCAAGAAGA TCCTTTGATC TTTTCTACGG GGTCTGACGC TCAGTGGAAC GAAAACTCAC GTTAAGGGAT TTTGGTCATG 
GAGTTCTTCT AGGAAACTAG AAAAGATGCC CCAGACTGCG AGTCACCTTG CTTTTGAGTG CAATTCCCTA AAACCAGTAC 

3121 AGATTATCAA AAAGGATCTT CACCTAGATC CTTTTAAATT AAAAATGAAG TTTTAAATCA ATCTAAAGTA TATATGAGTA
TCTAATAGTT TTTCCTAGAA GTGGATCTAG GAAAATTTAA TTTTTACTTC AAAATTTAGT TAGATTTCAT ATATACTCAT 

3201 AACTTGGTCT GACAGTTACC AATGCTTAAT CAGTGAGGCA CCTATCTCAG CGATCTGTCT ATTTCGTTCA TCCATAGTTG 
TTGAACCAGA CTGTCAATGG TTACGAATTA GTCACTCCGT GGATAGAGTC GCTAGACAGA TAAAGCAAGT AGGTATCAAC 



3281 CCTGACTCCC CGTCGTGTAG ATAACTACGA TACGGGAGGG CTTACCATCT GGCCCCAGTG CTGCAATGAT ACCGCGAGAC 
GGACTGAGGG GCAGCACATC TATTGATGCT ATGCCCTCCC GAATGGTAGA CCGGGGTCAC GACGTTACTA TGGCGCTCTG 

3361 CCACGCTCAC CGGCTCCAGA TTTATCAGCA ATAAACCAGC CAGCCGGAAG GGCCGAGCGC AGAAGTGGTC CTGCAACTTT 
GGTGCGAGTG GCCGAGGTCT AAATAGTCGT TATTTGGTCG GTCGGCCTTC CCGGCTCGCG TCTTCACCAG GACGTTGAAA 

3441 ATCCGCCTCC ATCCAGTCTA TTAATTGTTG CCGGGAAGCT AGAGTAAGTA GTTCGCCAGT TAATAGTTTG CGCAACGTTG
TAGGCGGAGG TAGGTCAGAT AATTAACAAC GGCCCTTCGA TCTCATTCAT CAAGCGGTCA ATTATCAAAC GCGTTGCAAC 

3521 TTCCCATTGC TACAGGCATC GTGGTGTCAC GCTCGTCGTT TGGTATGGCT TCATTCAGCT CCGGTTCCCA ACCATCAAGG
AACGGTAACG ATGTCCGTAG CACCACAGTG CGAGCAGCAA ACCATACCGA AGTAAGTCGA GGCCAAGGGT TCCTACTTCC

3601 CGAGTTACAT GATCCCCCAT GTTGTGCAAA AAAGCGGTTA GCTCCTTCGG TCCTCCGATC GTTGTCAGAA GTAAGTTGGC
GCTCAATGTA CTAGGGGGTA CAACACGTTT TTTCGCCAAT CGAGGAAGCC AGGAGGCTAG CAACAGTCTT CATTCAACCG 

3681 CGCAGTGTTA TCACTCATGG TTATGGCAGC ACTGCATAAT TCTCTTACTG TCATCCCATC CGTAAGATGC TTTTCTGTGA
GCGTCACAAT AGTGAGTACC AATACCGTCG TGACGTATTA AGAGAATGAC AGTACGGTAG GCATTCTACG AAAAGACACT 

3761 CTGGTGAGTA CTCAACCAAG TCATTCTGAG AATAGTGTAT GCGGCGACCG AGTTGCTCTT GCCCCGCCTC AATACGGGAT 
GACCACTCAT GAGTTGGTTC AGTAAGACTC TTATCACATA CGCCGCTCCC TCAACGAGAA CGGGCCGCAG TTATGCCCTA 



3841 AATACCGCGC CACATAGCAG AACTTTAAAA GTGCTCATCA TTGGAAAACG TTCTTCGGGG CGAAAACTCT CAAGGATCTT 
TTATGGCGCG GTGTATCGTC TTGAAATTTT CACGAGTAGT AACCTTTTGC AAGAAGCCCC GCTTTTGAGA GTTCCTAGAA 

3921 ACCGCTGTTG AGATCCAGTT CGATGTAACC CACTCGTGCA CCCAACTGAT CTTCAGCATC TTTTACTTTC ACCAGCGTTT 
TGGCGACAAC TCTAGGTCAA GCTACATTGG GTGAGCACGT GGGTTGACTA GAAGTCGTAG AAAATGAAAG TGGTCGCAAA 



4001 CTGGGTGAGC AAAAACAGGA AGGCAAAATG CCGCAAAAAA GGGAATAAGG CCCACACGGA AATGTTGAAT ACTCATACTC 
GACCCACTCG TTTTTGTCCT TCCGTTTTAC GGCGTTTTTT CCCTTATTCC CGCTGTGCCT TTACAACTTA TGAGTATGAG 



4081 TTCCTTTTTC AATATTATTG AAGCATTTAT CAGGGTTATT GTCTCATGAG CGGATACATA TTTGAATGTA TTTAGAAAAA
AAGGAAAAAG TTATAATAAC TTCGTAAATA GTCCCAATAA CAGAGTACTC GCCTATGTAT AAACTTACAT AAATCTTTTT

4161 TAAACAAATA GGGGTTCCGC GCACATTTCC CCGAAAAGTG CCACCTGACG TCTAAGAAAC CATTATTATC ATGACATTAA
ATTTGTTTAT CCCCAAGGCG CGTGTAAAGG GGCTTTTCAC GGTGGACTGC AGATTCTTTG GTAATAATAG TACTGTAATT



4241 CCTATAAAAA TAGGCGTATC ACGAGGCCCT TTCGTC 
GGATATTTTT ATCCGCATAG TGCTCCGGGA AAGCAG
1 TCGCGCGTTT CGCTGATGAC GGTGAAAACC TCTGACACAT GCAGCTCCCG
AGCGCGCAAA GCCACTACTG CCACTTTTGG AGACTGTGTA CGTCGAGGGC

51 GAGACGGTCA CAGCTTGTCT GTAAGCGGAT GCCGGGAGCA GACAAGCCCG
CTCTGCCAGT GTCGAACAGA CATTCGCCTA CGGCCCTCGT CTGTTCGGGC

101 TCAGGGCGCG TCAGCGGGTG TTGGCGGGTG TCGGGGCTGG CTTAACTATG
AGTCCCGCGC AGTCGCCCAC AACCGCCCAC AGCCCCGACC GAATTGATAC

151 CGGCATCAGA GCAGATTGTA CTGAGAGTGC ACCATATGAA GCTTTTTGCA
GCCGTAGTCT CGTCTAACAT GACTCTCACG TGGTATACTT CGAAAAACGT

StuI

201 AAAGCCTAGG CCTCCAAAAA AGCCTCCTCA CTACTTCTGG AATAGCTCAG
TTTCGGATCC GGAGGTTTTT TCGGAGGAGT GATGAAGACC TTATCGAGTC

251 AGGCCGAGGC GGCCTCGGCC TCTGCATAAA TAAAAAAAAT TAGTCAGCCA
TCCGGCTCCG CCGGAGCCGC AGACGTATTT ATTTTTTTTA ATCAGTCGGT

301 TGGGGCGGAG AATGGGCGGA ACTGGGCGGG GAGGGAATTA TTGGCTATTG
ACCCCGCCTC TTACCCGCCT TCACCCGCCC CTCCCTTAAT AACCGATAAC

351 GCCATTGCAT ACGTTGTATC TATATCATAA TATGTACATT TATATTGGCT
CGGTAACGTA TGCAACATAG ATATAGTATT ATACATGTAA ATATAACCGA

401 CATGTCCAAT ATGACCGCCA TGTTGACATT GATTATTGAC TAGTTATTAA
GTACAGGTTA TACTGCCGGT ACAACTGTAA CTAATAACTG ATCAATAATT

451 TAGTAATCAA TTACGGGGTC ATTAGTTCAT AGCCCATATA TGGAGTTCCG
ATCATTAGTT AATGCCCCAG TAATCAAGTA TCGGGTATAT ACCTCAAGGC

501 CGTTACATAA CTTACGGTAA ATGGCCCGCC TGGCTGACCG CCCAACGACC
GCAATGTATT GAATGCCATT TACCGGGCGG ACCGACTGGC GGGTTGCTGG

551 CCCGCCCATT GACGTCAATA ATGACGTATG TTCCCATAGT AACGCCAATA
GGGCGGGTAA CTCCAGTTAT TACTGCATAC AAGGGTATCA TTGCGGTTAT

601 GGGACTTTCC ATTGACGTCA ATGGGTGGAG TATTTACGGT AAACTGCCCA
CCCTGAAAGG TAACTGCAGT TACCCACCTC ATAAATGCCA TTTGACGGGT

651 CTTGGCAGTA CATCAAGTGT ATCATATGCC AAGTCCGCCC CCTATTGACG
GAACCGTCAT GTAGTTCACA TAGTATACGG TTCAGGCGGG GGATAACTGC

701 TCAATGACGG TAAATGGCCC GCCTGGCATT ATGCCCAGTA CATGACCTTA
AGTTACTGCC ATTTACCGGG CGGACCGTAA TACGGGTCAT GTACTGGAAT

751 CGGGACTTTC CTACTTGGCA GTACATCTAC GTATTAGTCA TCGCTATTAC
GCCCTGAAAG GATGAACCGT CATGTAGATG CATAATCAGT AGCGATAATG

801 CATGGTGATG CGGTTTTGGC AGTACACCAA TGGGCGTGGA TAGCGGTTTG
GTACCACTAC GCCAAAACCG TCATGTGCTT ACCCGCACCT ATCGCCAAAC

851 ACTCACGGGG ATTTCCAAGT CTCCACCCCA TTGACGTCAA TGGGAGTTTG
TGAGTGCCCC TAAAGGTTCA GAGCTGGGGT AACTGCAGTT ACCCTCAAAC



WO 01/58360 PCT/US00/32326

PCMV-NS34A
FIGURE 9 - Page 2



901 TTTTGGCACC AAAATCAACG GGACTTTCCA AAATGTCGTA ATAACCCCGC
AAAACCGTGG TTTTAGTTGC CCTGAAAGGT TTTACAGCAT TATTGGGGCG

951 CCCGTTGACG CAAATGGGCG GTAGGCGTGT ACGGTGGGAG GTCTATATAA
GGGCAACTGC GTTTACCCGC CATCCGCACA TGCCACCCTC CAGATATATT

1001 GCAGAGCTCG TTTAGTGAAC CGTCAGATCG CCTGGAGACC CCATCCACGC
CGTCTCGAGC AAATCACTTG GCAGTCTAGC GGACCTCTGC GGTAGGTGCG

1051 TGTTTTGACC TCCATAGAAG ACACCGGGAC CGATCCAGCC TCCGCGGCCG
ACAAAACTGG AGGTATCTTC TGTGGCCCTG GCTAGGTCGG AGGCGCCGGC

1101 GGAACGGTGC ATTGGAACGC GGATTCCCCG TGCCAAGAGT GACGTAAGTA
CCTTGCCACG TAACCTTGCG CCTAAGGGGC ACGGTTCTCA CTGCATTCAT

1151 CCGCCTATAG ACTCTATAGG CACACCCCTT TGGCTCTTAT GCATGCTATA
GGCGGATATC TGAGATATCC GTGTGGGGAA ACCGAGAATA CGTACGATAT

1201 CTGTTTTTGG CTTGGGGCCT ATACACCCCC GCTCCTTATG CTATAGGTGA
GACAAAAACC GAACCCCGGA TATGTGGGGG CGAGGAATAC GATATCCACT

1251 TGGTATAGCT TAGCCTATAG GTGTGGGTTA TTGACCATTA TTGACCACTC
ACCATATCGA ATCGGATATC CACACCCAAT AACTGGTAAT AACTGGTGAG

1301 CCCTATTGGT GACGATACTT TCCATTACTA ATCCATAACA TGGCTCTTTG
GGGATAACCA CTGCTATGAA AGGTAATGAT TAGGTATTGT ACCGAGAAAC

1351 CCACAACTAT CTCTATTGGC TATATGCCAA TACTCTGTCC TTCAGAGACT
GGTGTTGATA GAGATAACCG ATATACGGTT ATGAGACAGG AAGTCTCTGA

1401 GACACGGACT CTGTATTTTT ACAGGATGGG GTCCATTTAT TATTTACAAA
CTGTGCCTGA GACATAAAAA TGTCCTACCC CAGGTAAATA ATAAATGTTT

1451 TTCACATATA CAACAACGCC GTCCCCCGTG CCCGCAGTTT TTATTAAACA
AAGTGTATAT GTTGTTGCGG CAGGGGGCAC GGGCGTCAAA AATAATTTGT

1501 TAGCGTGGGA TCTCCGACAT CTCGGGTACG TGTTCCGGAC ATGGGCTCTT
ATCGCACCCT AGAGGCTGTA GAGCCCATGC ACAAGGCCTG TACCCGAGAA

1551 CTCCGGTAGC GGCGGAGCTT CCACATCCGA GCCCTGGTCC CATCCGTCCA
GAGGCCATCG CCGCCTCGAA GGTGTAGGCT CGGGACCAGG GTAGGCAGGT

1601 GCGGCTCATG GTCGCTCGGC AGCTCCTTGC TCCTAACAGT GGAGGCCAGA
CGCCGAGTAC CAGCGAGCCG TCGAGGAACG AGGATTGTCA CCTCCGGTCT

1651 CTTAGGCACA GCACAATGCC CACCACCACC AGTGTGCCGC ACAAGGCCGT
GAATCCGTGT CGTGTTACGG GTGGTGGTGG TCACACGGCG TGTTCCGGCA

1701 GGCGGTAGGG TATGTGTCTG AAAATGAGCT CGGAGATTGG GCTCGCACCT
CCGCCATCCC ATACACAGAC TTTTACTCGA GCCTCTAACC CGAGCGTGGA

1751 GGACGCAGAT GGAAGACTTA AGGCAGCGGC AGAAGAAGAT GCAGGCAGCT
CCTGCGTCTA CCTTCTGAAT TCCGTCGCCG TCTTCTTCTA CGTCCGTCGA

1801 GAGTTGTTGT ATTCTGATAA GAGTCAGAGG TAACTCCCGT TGCGGTGCTG
CTCAACAACA TAAGACTATT CTCAGTCTCC ATTGAGGGCA ACGCCACGAC
1851 TTAACGCTGG AGGGCAGTGT AGTCTGAGCA GTACTCGTTG CTGCCGCGCG
AATTGCCACC TCCCGTCACA TCAGACTCGT CATGAGCAAC GACGGCGCGC 



1901 CGCCACCAGA CATAATAGCT GACAGACTAA CAGACTGTTC CTTTCCATGG 
GCGGTGGTCT GTATTATCGA CTGTCTGATT GTCTGACAAG GAAAGGTACC 



+2 MAP

EcoRI

1951 GTCTTTTCTG CAGTCACCGT CGTCGACCTA AGAATTCACC ATGGCGCCCA
CAGAAAAGAC GTCAGTGGCA GCAGCTGGAT TCTTAAGTGG TACCGCGGGT 



+2 ITAY AQQ TRGL LGC IIT
2001 TCACGGCGTA CGCCCAGCAG ACAAGGGGCC TCCTAGGGTG CATAATCACC 
AGTGCCGCAT GCGGGTCGTC TGTTCCCCGG AGGATCCCAC GTATTAGTGG 



+2 SLTG RDK NQV EGEV QIV
2051 AGCCTAACTG GCCGGGACAA AAACCAAGTG GAGGGTGAGG TCCAGATTGT
TCGGATTGAC CGGCCCTGTT TTTGGTTCAC CTCCCACTCC AGGTCTAACA



+ 2 STA AQTF LAT CIN GVC 
2101 GTCAACTGCT GCCCAAACCT TCCTGGCAAC GTGCATCAAT GGGGTGTGCT 
CAGTTGACGA CGGGTTTGGA AGGACCGTTG CACGTAGTTA CCCCACACGA 



+2 WTVY HGA GTRT IAS PKG
2151 GGACTGTCTA CCACGGGGCC GGAACGAGGA CCATCGCGTC ACCCAAGGGT
CCTGACAGAT GGTGCCCCGG CCTTGCTCCT GCTAGCGCAG TGGGTTCCCA 



+2 PVIQ MYT NVD QDLV GWP
2201 CCTGTCATCC AGATGTATAC CAATGTAGAC CAAGACCTTG TGGGCTGGCC 
GGACAGTAGG TCTACATATG GTTACATCTG GTTCTGGAAC ACCCGACCGG 



+2 ASQ GTRS LTP CTC GSS
2251 CGCTTCGCAA GGTACCCGCT CATTGACACC CTGCACTTGC GGCTCCTCGG 
GCGAAGCGTT CCATGGGCGA GTAACTGTGG GACGTGAACG CCGAGGAGCC 



+2 DLYL VTR HADV IPV RRK
2301 ACCTTTACCT GGTCACGAGG CACGCCGATG TCATTCCCGT GCGCCGGCGG 
TGGAAATGGA CCAGTGCTCC GTGCGGCTAC AGTAAGGGCA CGCGGCCGCC 



+2 GDSR GSL LSP RPIS YLK
2351 GGTGATAGCA GGGGCAGCCT GCTGTCGCCC CGGCCCATTT CCTACTTGAA
CCACTATCGT CCCCGTCGGA CGACAGCGGG GCCGGGTAAA GGATGAACTT 



+2 GSS GGPL LCP AGH AVG 
2401 AGGCTCCTCG GGGGGTCCGC TGTTGTGCCC CGCGGGGCAC GCCGTGGGCA
TCCGAGGAGC CCCCCAGGCG ACAACACGGG GCGCCCCGTG CGGCACCCGT 



+2 IFRA AVC TRGV AKA VDF
2451 TATTTAGGGC CGCGGTGTGC ACCCGTGGAG TGGCTAAGGC GGTGGACTTT
ATAAATCCCG GCGCCACACG TGGGCACCTC ACCGATTCCG CCACCTGAAA



+2 IPVE NLE TTM RSPV FTD
2501 ATCCCTGTGG AGAACCTAGA GACAACCATG AGGTCCCCGG TGTTCACGGA
TAGGGACACC TCTTGGATCT CTGTTGGTAC TCCAGGGGCC ACAAGTGCCT
+2 NSS PPVV PQS FQV AHL
2551 TAACTCCTCT CCACCAGTAG TGCCCCAGAG CTTCCAGGTG GCTCACCTCC 
ATTGAGGAGA GGTGGTCATC ACGGGGTCTC GAAGGTCCAC CGAGTGGAGG 



+2HAPT GSG KSTK VPA AYA 
2601 ATGCTCCCAC AGGCAGCGGC AAAAGCACCA AGGTCCCCGC TGCATATGCA 
TACGAGGGTG TCCGTCGCCG TTTTCGTGGT TCCAGGGCCG ACGTATACGT 



+2 AQGY KVL VLN PSVA ATL
2651 GCTCAGGGCT ATAAGGTGCT AGTACTCAAC CCCTCTGTTG CTGCAACACT 
CGAGTCCCGA TATTCCACGA TCATGAGTTG GGGAGACAAC GACGTTGTGA 



+ 2 GFG AYMS KAH GID PNI 
2701 GGGCTTTGGT GCTTACATGT CCAAGGCTCA TGGGATCGAT CCTAACATCA
CCCGAAACCA CGAATGTACA GGTTCCGAGT ACCCTAGCTA GGATTGTAGT



+ 2RTGV RTI TTGS PIT YST 
2751 GGACCGGGGT GAGAACAATT ACCACTGGCA GCCCCATCAC GTACTCCACC 
CCTGGCCCCA CTCTTGTTAA TGGTGACCGT CGGGGTAGTG CATGAGGTGG 



+ 2YGKF LAD GGC SGGA YDI 
2801 TACGGCAAGT TCCTTGCCGA CGGCGGGTGC TCGGGGGGCG CTTATGACAT
ATGCCGTTCA AGGAACGGCT GCCGCCCACG AGCCCCCCGC GAATACTGTA 



+2 IIC DECH STD ATS ILG
2851 AATAATTTGT GACGAGTGCC ACTCCACGGA TGCCACATCC ATCTTGGGCA 
TTATTAAACA CTGCTCACGG TGAGGTGCCT ACGGTGTAGG TAGAACCCGT 



+2 I GTV LDQ AETA GAR LVV
2901 TTGGCACTGT CCTTGACCAA GCAGAGACTG CGGGGGCGAG ACTGGTTGTG
AACCGTGACA GGAACTGGTT CGTCTCTGAC GCCCCCGCTC TGACCAACAC



+2 LATA TPP GSV TVPH PNI 
2951 CTCGCCACCG CCACCCCTCC GGGCTCCGTC ACTGTGCCCC ATCCCAACAT
GAGCGGTGGC GGTGGGGAGG CCCGAGGCAG TGACACGGGG TAGGGTTGTA 



+2 EEV ALST TGE IPF YGK
3001 CGAGGAGGTT GCTCTGTCCA CCACCGGAGA GATCCCTTTT TACGGCAAGG
GCTCCTCCAA CGAGACAGGT GGTGGCCTCT CTAGGGAAAA ATGCCGTTCC 



+2AIPL EVI KGGR HLI FCH 
3051 CTATCCCCCT CGAAGTAATC AAGGGGGGGA GACATCTCAT CTTCTGTCAT 
GATAGGGGGA GCTTCATTAG TTCCCCCCCT CTGTAGAGTA GAAGACAGTA 



+2SKKK CDE LAA KLVA LGI 
3101 TCAAAGAAGA AGTGCGACGA ACTCGCCGCA AAGCTGGTCG CATTGGGCAT 
AGTTTCTTCT TCACGCTGCT TGAGCGGCGT TTCGACCAGC GTAACCCGTA



+2 NAV AYYR GLD VSV IPT 
3151 CAATGCCGTG GCCTACTACC GCGGTCTTGA CGTGTCCGTC ATCCCGACCA 
GTTACGGCAC CGGATGATGG CGCCAGAACT GCACAGGCAG TAGGGCTGGT 



+2 SGDV VVV ATDA LMT GYT
3201 GCGGCGATGT TGTCGTCGTG GCAACCGATG CCCTCATGAC CGGCTATACC 
CGCCGCTACA ACAGCAGCAC CGTTGGCTAC GGGAGTACTG GCCGATATGG 
+2 GDFD SVI DCN TCVT QTV
3251 GGCGACTTCG ACTCGGTGAT AGACTGCAAT ACGTGTGTCA CCCAGACAGT
CCGCTGAAGC TGAGCCACTA TCTGACGTTA TGCACACAGT GGGTCTGTCA



+2 DFS LDPT FTI ETI TLP
3301 CGATTTCAGC CTTGACCCTA CCTTCACCAT TGAGACAATC ACGCTCCCCC
GCTAAAGTCG GAACTGGGAT GGAAGTGGTA ACTCTGTTAG TGCGAGGGGG 



+ 2QDAV SRT QRRG RTG RGK 
3351 AAGATGCTGT CTCCCGCACT CAACGTCGGG GCAGGACTGG CAGGGGGAAG 
TTCTACGACA GAGGGCGTGA GTTGCAGCCC CGTCCTGACC GTCCCCCTTC 



+ 2PGIY RFV APG ERPS GMF 
3401 CCAGGCATCT ACAGATTTGT GGCACCGGGG GAGCGCCCCT CCGGCATGTT 
GGTCCGTAGA TGTCTAAACA CCGTGGCCCC CTCGCGGGGA GGCCGTACAA 



+ 2 DSS VLCE CYD AGC AWY 
3451 CGACTCGTCC GTCCTCTGTG AGTGCTATGA CGCAGGCTGT GCTTGGTATG 
GCTGAGCAGG CAGGAGACAC TCACGATACT GCGTCCGACA CGAACCATAC 



+ 2ELTP AET TVRL RAY MNT 
3501 AGCTCACGCC CGCCGAGACT ACAGTTAGGC TACGAGCGTA CATGAACACC 
TCGAGTGCGG GCGGCTCTGA TGTCAATCCG ATGCTCGCAT GTACTTGTGG



+ 2PGLP VCQ DHL EFWE GVF 
3551 CCGGGGCTTC CCGTGTGCCA GGACCATCTT GAATTTTGGG AGGGCGTCTT 
GGCCCCGAAG GGCACACGGT CCTGGTAGAA CTTAAAACCC TCCCGCAGAA 



+2 TGL THID AHF LSQ TKQ
StuI



3601 TACAGGCCTC ACTCATATAG ATGCCCACTT TCTATCCCAG ACAAAGCAGA 
ATGTCCGGAG TGAGTATATC TACGGGTGAA AGATAGGGTC TGTTTCGTCT 



+ 2SGEN LPY LVAY QAT VCA 
3651 GTGGGGAGAA CCTTCCTTAC CTGGTAGCGT ACCAAGCCAC CGTGTGCGCT 
CACCCCTCTT GGAAGGAATG GACCATCGCA TGGTTCGGTG GCACACGCGA 



+ 2RAQA PPP SWD QMHK CLI 
3701 AGGGCTCAAG CCCCTCCCCC ATCGTGGGAC CAGATGTGGA AGTGTTTGAT 
TCCCGAGTTC GGGGAGGGGG TAGCACCCTG GTCTACACCT TCACAAACTA 



+ 2 RLK PTLH GPT PLL YRL 
3751 TCGCCTCAAG CCCACCCTCC ATGGGCCAAC ACCCCTGCTA TACAGACTGG 
AGCGGAGTTC GGGTGGGAGG TACCCGGTTG TGGGGACGAT ATGTCTGACC



+2 GAVQ NEI TLTH PVT KYI
3801 GCGCTGTTCA GAATGAAATC ACCCTGACGC ACCCAGTCAC CAAATACATC
CGCGACAAGT CTTACTTTAG TGGGACTGCG TGGGTCAGTG GTTTATGTAG



+2MTCM SAD LEV VTST WVL 
3851 ATGACATGCA TGTCGGCCGA CCTGGAGGTC GTCACGAGCA CCTGGGTGCT 
TACTGTACGT ACAGCCGGCT GGACCTCCAG CAGTGCTCGT GGACCCACGA 



+2 VGG VLAA LAA YCL STG
3901 CGTTGGCGGC GTCCTGGCTG CTTTGCCCGC GTATTCCCTG TCAACAGGCT
GCAACCGCCG CAGGACCGAC GAAACCGGCG CATAACGGAC AGTTGTCCGA
+2 CVVI VGR VVLS GKP AII
3951 GCGTGGTCAT AGTGGGCAGG GTCGTCTTGT CCGGGAAGCC GGCAATCATA
CGCACCAGTA TCACCCGTCC CAGCAGAACA GGCCCTTCGG CCGTTAGTAT

+2 PDRE VLY REF DEME EC
4001 CCTGACAGGG AAGTCCTCTA CCGAGAGTTC GATGAGATGG AAGAGTGCTA
GGACTGTCCC TTCAGGAGAT GGCTCTCAAG CTACTCTACC TTCTCACGAT

BamHI MluI

4051 GGATCCACTA CGCGTTAGAG CTCGCTGATC AGCCTCGACT GTGCCTTCTA
CCTAGGTGAT GCGCAATCTC GAGCGACTAG TCGGAGCTGA CACGGAAGAT

4101 GTTGCCAGCC ATCTGTTGTT TGCCCCTCCC CCGTGCCTTC CTTGACCCTG
CAACGGTCGG TAGACAACAA ACGGGGAGGG GGCACGGAAG GAACTGGGAC

4151 GAAGGTGCCA CTCCCACTGT CCTTTCCTAA TAAAATGAGG AAATTGCATC
CTTCCACGGT GAGGGTGACA GGAAAGGATT ATTTTACTCC TTTAACGTAG

4201 GCATTGTCTG AGTAGGTGTC ATTCTATTCT GGGGGGTGGG GTGGGGCAGG
CGTAACAGAC TCATCCACAG TAAGATAAGA CCCCCCACCC CACCCCGTCC

4251 ACAGCAAGGG GGAGGATTGG GAAGACAATA GCAGGCATGC TGGGGAGCTC
TGTCGTTCCC CCTCCTAACC CTTCTGTTAT CGTCCGTACG ACCCCTCGAG

4301 TTCCGCTTCC TCGCTCACTG ACTCGCTGCG CTCGGTCGTT CGGCTGCGGC
AAGGCGAAGG AGCGAGTGAC TGAGCGACGC GAGCCAGCAA GCCGACGCCG

4351 GAGCGGTATC AGCTCACTCA AAGGCGGTAA TACGGTTATC CACAGAATCA
CTCGCCATAG TCGAGTGAGT TTCCGCCATT ATGCCAATAG GTGTCTTAGT

4401 GGGGATAACG CAGGAAAGAA CATGTGAGCA AAAGGCCAGC AAAAGGCCAG
CCCCTATTGC GTCCTTTCTT GTACACTCGT TTTCCGGTCG TTTTCCGGTC

4451 GAACCGTAAA AAGGCCGCGT TGCTGGCGTT TTTCCATAGG CTCCGCCCCC
CTTGGCATTT TTCCGGCGCA ACGACCGCAA AAAGGTATCC GAGGCGGGGG

4501 CTGACGAGCA TCACAAAAAT CGACGCTCAA GTCAGAGGTG GCGAAACCCG
GACTGCTCGT AGTGTTTTTA GCTGCGAGTT CAGTCTCCAC CGCTTTGGGC

4551 ACAGGACTAT AAAGATACCA GGCGTTTCCC CCTGGAAGCT CCCTCGTGCG
TGTCCTGATA TTTCTATGGT CCGCAAAGGG GGACCTTCGA GGGAGCACGC

4601 CTCTCCTGTT CCGACCCTGC CGCTTACCGG ATACCTGTCC GCCTTTCTCC
GAGAGGACAA GGCTGGGACG GCGAATGGCC TATGGACAGG CGGAAAGAGG

4651 CTTCGGGAAG CGTGGCGCTT TCTCAATGCT CACGCTGTAG GTATCTCAGT
GAAGCCCTTC GCACCGCGAA AGAGTTACGA GTGCGACATC CATAGAGTCA

4701 TCGGTGTAGG TCGTTCGCTC CAAGCTGGGC TGTGTGCACG AACCCCCCGT
AGCCACATCC AGCAAGCGAG GTTCGACCCG ACACACGTGC TTGGGGGGCA

4751 TCAGCCCGAC CGCTGCGCCT TATCCGGTAA CTATCGTCTT GAGTCCAACC
AGTCGGGCTG GCGACGCGGA ATAGGCCATT GATAGCAGAA CTCAGGTTGG

4801 CGGTAAGACA CGACTTATCG CCACTGGCAG CAGCCACTGG TAACAGGATT
GCCATTCTGT GCTGAATAGC GGTGACCGTC GTCGGTGACC ATTGTCCTAA
4851


AGCAGAGCGA GGTATGTAGG CGGTGCTACA GAGTTCTTGA AGTGGTGGCC
TCGTCTCGCT CCATACATCC GCCACGATGT CTCAAGAACT TCACCACCGG 




4901 


TAACTACGGC TACACTAGAA GGACAGTATT TGGTATCTGC GCTCTGCTGA

attgatgccg atgtgatctt cctgtcataa accatagacg cgagacgact 




4951 


AGCCAGTTAC CTTCGGAAAA AGAGTTGGTA GCTCTTGATC CGGCAAACAA

tcggtcaatg gaagcctttt tctcaaccat cgagaactag gccgtttgtt 




5001 


ACCACCGCTG GTAGCGGTGG TTTTTTTGTT TGCAAGCAGC AGATTACGCG

tggtggcgac catcgccacc aaaaaaacaa acgttcgtcg tctaatgcgc 




5051 


CAGAAAAAAA GGATCTCAAG AAGATCCTTT GATCTTTTCT ACGGGGTCTG
GTCTTTTTTT CCTAGAGTTC TTCTAGGAAA CTAGAAAAGA TGCCCCAGAC 




5101 


ACGCTCAGTG GAACGAAAAC TCACGTTAAG GGATTTTGGT CATGAGATTA
TGCGAGTCAC CTTGCTTTTG AGTGCAATTC CCTAAAACCA GTACTCTAAT 




5151 


TCAAAAAGGA TCTTCACCTA GATCCTTTTA AATTAAAAAT GAAGTTTTAA
AGTTTTTCCT AGAAGTGGAT CTAGGAAAAT TTAATTTTTA CTTCAAAATT 




5201 


ATCAATCTAA AGTATATATG AGTAAACTTG GTCTGACAGT TACCAATGCT
TAGTTAGATT TCATATATAC TCATTTGAAC CAGACTGTCA ATGGTTACGA 




5251 


TAATCAGTGA GGCACCTATC TCAGCGATCT GTCTATTTCG TTCATCCATA
ATTAGTCACT CCGTGGATAG AGTCGCTAGA CAGATAAAGC AAGTAGGTAT




5301 


GTTGCCTGAC TCCCCGTCGT GTAGATAACT ACGATACGGG AGGGCTTACC
CAACGGACTG AGGGGCAGCA CATCTATTGA TGCTATGCCC TCCCGAATGG 




5351 


ATCTGGCCCC AGTGCTGCAA TGATACCGCG AGACCCACGC TCACCGGCTC
TAGACCGGGG TCACGACGTT ACTATGGCGC TCTGGGTGCG AGTGGCCGAG 




5401 


CAGATTTATC AGCAATAAAC CAGCCAGCCG GAAGGGCCGA GCGCAGAAGT
GTCTAAATAG TCGTTATTTG GTCGGTCGGC CTTCCCGGCT CGCGTCTTCA




5451 


GGTCCTGCAA CTTTATCCGC CTCCATCCAG TCTATTAATT GTTGCCGGGA 
CCAGGACGTT GAAATAGGCG GAGGTAGGTC AGATAATTAA CAACGGCCCT 




5501 


AGCTAGAGTA AGTAGTTCGC CAGTTAATAG TTTGCGCAAC GTTGTTGCCA
TCGATCTCAT TCATCAAGCG GTCAATTATC AAACGCGTTG CAACAACGGT 




5551 


TTGCTACAGG CATCGTGGTG TCACGCTCGT CGTTTGGTAT GGCTTCATTC
AACGATGTCC GTAGCACCAC AGTGCGAGCA GCAAACCATA CCGAAGTAAG 




5601 


AGCTCCGGTT CCCAACGATC AAGGCGAGTT ACATGATCCC CCATGTTGTG
TCGAGGCCAA GGGTTGCTAG TTCCGCTCAA TGTACTAGGG GGTACAACAC 




5651 


CAAAAAAGCG GTTAGCTCCT TCGGTCCTCC GATCGTTGTC AGAAGTAAGT 
GTTTTTTCGC CAATCGAGGA AGCCAGGAGG CTAGCAACAG TCTTCATTCA




5701 


TGGCCGCAGT GTTATCACTC ATGGTTATGG CAGCACTGCA TAATTCTCTT 
ACCGGCGTCA CAATAGTGAG TACCAATACC GTCGTGACGT ATTAAGAGAA




5751 


ACTGTCATGC CATCCGTAAG ATGCTTTTCT GTGACTGGTG AGTACTCAAC 
TGACAGTACG GTAGGCATTC TACGAAAAGA CACTGACCAC TCATGAGTTG 


5801


CAAGTCATTC TGAGAATAGT GTATGCGGCG ACCGAGTTGC TCTTGCCCGG
GTTCAGTAAG ACTCTTATCA CATACGCCGC TGGCTCAACG AGAACGGGCC


5851


CGTCAATACG GGATAATACC GCGCCACATA GCAGAACTTT AAAAGTGCTC 
GCAGTTATGC CCTATTATGG CGCGGTGTAT CGTCTTGAAA TTTTCACGAG 


5901


ATCATTGGAA AACGTTCTTC GGGGCGAAAA CTCTCAAGGA TCTTACCGCT
TAGTAACCTT TTGCAAGAAG CCCCGCTTTT GAGAGTTCCT AGAATGGCGA


5951


GTTGAGATCC AGTTCGATGT AACCCACTCG TGCACCCAAC TGATCTTCAG
CAACTCTAGG TCAAGCTACA TTGGGTGAGC ACGTGGGTTG ACTAGAAGTC


6001


CATCTTTTAC TTTCACCAGC GTTTCTGGGT GAGCAAAAAC AGGAAGGCAA
GTAGAAAATG AAAGTGGTCG CAAAGACCCA CTCGTTTTTG TCCTTCCGTT


6051


AATGCCGCAA AAAAGGGAAT AAGGGCGACA CGGAAATGTT GAATACTCAT
TTACGGCGTT TTTTCCCTTA TTCCCGCTGT GCCTTTACAA CTTATGAGTA


6101 


ACTCTTCCTT TTTCAATATT ATTGAAGCAT TTATCAGGGT TATTGTCTCA
TGAGAAGGAA AAAGTTATAA TAACTTCGTA AATAGTCCCA ATAACAGAGT 


6151


TGAGCGGATA CATATTTGAA TGTATTTAGA AAAATAAACA AATAGGGGTT
ACTCGCCTAT GTATAAACTT ACATAAATCT TTTTATTTGT TTATCCCCAA


6201


CCGCGCACAT TTCCCCGAAA AGTGCCACCT GACGTCTAAG AAACCATTAT
GGCGCGTGTA AAGGGGCTTT TCACGGTGGA CTGCAGATTC TTTGGTAATA


6251


TATCATGACA TTAACCTATA AAAATAGGCG TATCACGAGG CCCTTTCGTC
ATAGTACTGT AATTGGATAT TTTTATCCGC ATAGTGCTCC GGGAAAGCAG


MetAlaAlaTyrAlaAlaGlnGlyTyrLysValLeuVal
2 AGCTTACAAAACAAATTCACCATGGCTGCATATGCAGCTCAGGGCTATAAGGTGCTAGTA 
TCGAATGTTTTGTTTAAGTGGTACCGACGTATACGTCGAGTCCCGATATTCCACGATCAT 

1 HIND3, 21 NCOI, 30 NDEI, 58 SCAI, 

LeuAsnProSerValAlaAlaThrLeuGlyPheGlyAlaTyrMetSerLysAlaHisGly
62 CTCAACCCCTCTGTTGCTGCAACACTGGGCTTTGGTGCTTACATGTCCAAGGCTCATGGG 
GAGTTGGGGAGACAACGACGTTGTGACCCGAAACCACGAATGTACAGGTTCCGAGTACCC 

IleAspProAsnIleArgThrGlyValArgThrIleThrThrGlySerProIleThrTyr
122 ATCGATCCTAACATCAGGACCGGGGTGAGAACAATTACCACTGGCAGCCCCATCACGTAC
TAGCTAGGATTGTAGTCCTGGCCCCACTCTTGTTAATGGTGACCGTCGGGGTAGTGCATG 

A 

122 CLAI, 

SerThrTyrGlyLysPheLeuAlaAspGlyGlyCysSerGlyGlyAlaTyrAspIleIle
182 TCCACCTACGGCAAGTTCCTTGCCGACGGCGGGTGCTCGGGGGGCGCTTATGACATAATA 
AGGTGGATGCCGTTCAAGGAACGGCTGCCGCCCACGAGCCCCCCGCGAATACTGTATTAT 

IleCysAspGluCysHisSerThrAspAlaThrSerIleLeuGlyIleGlyThrValLeu
242 ATTTGTGACGAGTGCCACTCCACGGATGCCACATCCATCTTGGGCATTGGCACTGTCCTT
TAAACACTGCTCACGGTGAGGTGCCTACGGTGTAGGTAGAACCCGTAACCGTGACAGGAA 

AspGlnAlaGluThrAlaGlyAlaArgLeuValValLeuAlaThrAlaThrProProGly
302 GACCAAGCAGAGACTGCGGGGGCGAGACTGGTTGTGCTCGCCACCGCCACCCCTCCGGGC 
CTGGTTCGTCTCTGACGCCCCCGCTCTGACCAACACGAGCGGTGGCGGTGGGGAGGCCCG 

A 

309 ALWNI,

SerValThrValProHisProAsnIleGluGluValAlaLeuSerThrThrGlyGluIle
362 TCCGTCACTGTGCCCCATCCCAACATCGAGGAGGTTGCTCTGTCCACCACCGGAGAGATC 
AGGCAGTGACACGGGGTAGGGTTGTAGCTCCTCCAACGAGACAGGTGGTGGCCTCTCTAG 

ProPheTyrGlyLysAlaIleProLeuGluValIleLysGlyGlyArgHisLeuIlePhe
422 CCTTTTTACGGCAAGGCTATCCCCCTCGAAGTAATCAAGGGGGGGAGACATCTCATCTTC 
GGAAAAATGCCGTTCCGATAGGGGGAGCTTCATTAGTTCCCCCCCTCTGTAGAGTAGAAG 

CysHisSerLysLysLysCysAspGluLeuAlaAlaLysLeuValAlaLeuGlyIleAsn
482 TGTCATTCAAAGAAGAAGTGCGACGAACTCGCCGCAAAGCTGGTCGCATTGGGCATCAAT 
ACAGTAAGTTTCTTCTTCACGCTGCTTGAGCGGCGTTTCGACCAGCGTAACCCGTAGTTA 

AlaValAlaTyrTyrArgGlyLeuAspValSerValIleProThrSerGlyAspValVal
542 GCCGTGGCCTACTACCGCGGTCTTGACGTGTCCGTCATCCCGACCAGCGGCGATGTTGTC 
CGGCACCGGATGATGGCGCCAGAACTGCACAGGCAGTAGGGCTGGTCGCCGCTACAACAG 

A A 

556 SAC2, 566 DRDI,

ValValAlaThrAspAlaLeuMetThrGlyTyrThrGlyAspPheAspSerValIleAsp
602 GTCGTGGCAACCGATGCCCTCATGACCGGCTATACCGGCGACTTCGACTCGGTGATAGAC 
CAGCACCGTTGGCTACGGGAGTACTGGCCGATATGGCCGCTGAAGCTGAGCCACTATCTG 

A 

621 BSPHI,

CysAsnThrCysValThrGlnThrValAspPheSerLeuAspProThrPheThrIleGlu
662 TGCAATACGTGTGTCACCCAGACAGTCGATTTCAGCCTTGACCCTACCTTCACCATTGAG
ACGTTATGCACACAGTGGGTCTGTCAGCTAAAGTCGGAACTGGGATGGAAGTGGTAACTC 

ThrIleThrLeuProGlnAspAlaValSerArgThrGlnArgArgGlyArgThrGlyArg
722 ACAATCACGCTCCCCCAAGATGCTGTCTCCCGCACTCAACGTCGGGGCAGGACTGGCAGG 
TGTTAGTGCGAGGGGGTTCTACGACAGAGGGCGTGAGTTGCAGCCCCGTCCTGACCGTCC 

GlyLysProGlyIleTyrArgPheValAlaProGlyGluArgProSerGlyMetPheAsp
782 GGGAAGCCAGGCATCTACAGATTTGTGGCACCGGGGGAGCGCCCCTCCGGCATGTTCGAC
CCCTTCGGTCCGTAGATGTCTAAACACCGTGGCCCCCTCGCGGGGAGGCCGTACAAGCTG 

A A 

822 BGLI, 839 DRDI,

SerSerValLeuCysGluCysTyrAspAlaGlyCysAlaTrpTyrGluLeuThrProAla 
842 TCGTCCGTCCTCTGTGAGTGCTATGACGCAGGCTGTGCTTGGTATGAGCTCACGCCCGCC 
AGCAGGCAGGAGACACTCACGATACTGCGTCCGACACGAACCATACTCGAGTGCGGGCGG 

887 SACI, 

GluThrThrValArgLeuArgAlaTyrMetAsnThrProGlyLeuProValCysGlnAsp 
902 GAGACTACAGTTAGGCTACGAGCGTACATGAACACCCCGGGGCTTCCCGTGTGCCAGGAC 
CTCTGATGTCAATCCGATGCTCGCATGTACTTGTGGGGCCCCGAAGGGCACACGGTCCTG 

937 SMAI XMAI, 

HisLeuGluPheTrpGluGlyValPheThrGlyLeuThrHisIleAspAlaHisPheLeu 
962 CATCTTGAATTTTGGGAGGGCGTCTTTACAGGCCTCACTCATATAGATGCCCACTTTCTA 
GTAGAACTTAAAACCCTCCCGCAGAAATGTCCGGAGTGAGTATATCTACGGGTGAAAGAT 

A 

991 STUI, 

SerGlnThrLysGlnSerGlyGluAsnLeuProTyrLeuValAlaTyrGlnAlaThrVal
1022 TCCCAGACAAAGCAGAGTGGGGAGAACCTTCCTTACCTGGTAGCGTACCAAGCCACCGTG 
AGGGTCTGTTTCGTCTCACCCCTCTTGGAAGGAATGGACCATCGCATGGTTCGGTGGCAC 

A 

1075 DRA3, 

CysAlaArgAlaGlnAlaProProProSerTrpAspGlnMetTrpLysCysLeuIleArg
1082 TGCGCTAGGGCTCAAGCCCCTCCCCCATCGTGGGACCAGATGTGGAAGTGTTTGATTCGC 
ACGCGATCCCGAGTTCGGGGAGGGGGTAGCACCCTGGTCTACACCTTCACAAACTAAGCG 

LeuLysProThrLeuHisGlyProThrProLeuLeuTyrArgLeuGlyAlaValGlnAsn 
1142 CTCAAGCCCACCCTCCATGGGCCAACACCCCTGCTATACAGACTGGGCGCTGTTCAGAAT 
GAGTTCGGGTGGGAGGTACCCGGTTGTGGGGACGATATGTCTGACCCGCGACAAGTCTTA 

A 

1156 NCOI, 

GluIleThrLeuThrHisProValThrLysTyrIleMetThrCysMetSerAlaAspLeu
1202 GAAATCACCCTGACGCACCCAGTCACCAAATACATCATGACATGCATGTCGGCCGACCTG
CTTTAGTGGGACTGCGTGGGTCAGTGGTTTATGTAGTACTGTACGTACAGCCGGCTGGAC 

AAA A A 

1236 BSPHI, 1240 DRDI, 1243 AVA3, 1251 EAGI XMA3, 1256 DRDI,



GluValValThrSerThrTrpValLeuValGlyGlyValLeuAlaAlaLeuAlaAlaTyr
1262 GAGGTCGTCACGAGCACCTGGGTGCTCGTTGGCGGCGTCCTGGCTGCTTTGGCCGCGTAT
CTCCAGCAGTGCTCGTGGACCCACGAGCAACCGCCGCAGGACCGACGAAACCGGCGCATA

CysLeuSerThrGlyCysValValIleValGlyArgValValLeuSerGlyLysProAla
1322 TGCCTGTCAACAGGCTGCGTGGTCATAGTGGGCAGGGTCGTCTTGTCCGGGAAGCCGGCA 
ACGGACAGTTGTCCGACGCACCAGTATCACCCGTCCCAGCAGAACAGGCCCTTCGGCCGT 

1375 NAEI, 

IleIleProAspArgGluValLeuTyrArgGluPheAspGluMetGluGluCysSerGln
1382 ATCATACCTGACAGGGAAGTCCTCTACCGAGAGTTCGATGAGATGGAAGAGTGCTCTCAG 
TAGTATGGACTGTCCCTTCAGGAGATGGCTCTCAAGCTACTCTACCTTCTCACGAGAGTC 

A 

1391 DRDI,

HisLeuProTyrIleGluGlnGlyMetMetLeuAlaGluGlnPheLysGlnLysAlaLeu
1442 CACTTACCGTACATCGAGCAAGGGATGATGCTCGCCGAGCAGTTCAAGCAGAAGGCCCTC
GTGAATGGCATGTAGCTCGTTCCCTACTACGAGCGGCTCGTCAAGTTCGTCTTCCGGGAG 

GlyLeuLeuGlnThrAlaSerArgGlnAlaGluValIleAlaProAlaValGlnThrAsn
1502 GGCCTCCTGCAGACCGCGTCCCGTCAGGCAGAGGTTATCGCCCCTGCTGTCCAGACCAAC 
CCGGAGGACGTCTGGCGCAGGGCAGTCCGTCTCCAATAGCGGGGACGACAGGTCTGGTTG 

A A 

1508 PSTI, 1513 TTH3I, 

TrpGlnLysLeuGluThrPheTrpAlaLysHisMetTrpAsnPheIleSerGlyIleGln
1562 TGGCAAAAACTCGAGACCTTCTGGGCGAAGCATATGTGGAACTTCATCAGTGGGATACAA 
ACCGTTTTTGAGCTCTGGAAGACCCGCTTCGTATACACCTTGAAGTAGTCACCCTATGTT 

A A 

1571 XHOI, 1592 NDEI, 

TyrLeuAlaGlyLeuSerThrLeuProGlyAsnProAlaIleAlaSerLeuMetAlaPhe
1622 TACTTGGCGGGCTTGTCAACGCTGCCTGGTAACCCCGCCATTGCTTCATTGATGGCTTTT 
ATGAACCGCCCGAACAGTTGCGACGGACCATTGGGGCGGTAACGAAGTAACTACCGAAAA 

A 

1649 BSTE2, 

ThrAlaAlaValThrSerProLeuThrThrSerGlnThrLeuLeuPheAsnIleLeuGly
1682 ACAGCTGCTGTCACCAGCCCACTAACCACTAGCCAAACCCTCCTCTTCAACATATTGGGG 
TGTCGACGACAGTGGTCGGGTGATTGGTGATCGGTTTGGGAGGAGAAGTTGTATAACCCC 

A 

1683 ALWNI PVU2,

GlyTrpValAlaAlaGlnLeuAlaAlaProGlyAlaAlaThrAlaPheValGlyAlaGly
1742 GGGTGGGTGGCTGCCCAGCTCGCCGCCCCCGGTGCCGCTACTGCCTTTGTGGGCGCTGGC 
CCCACCCACCGACGGGTCGAGCGGCGGGGGCCACGGCGATGACGGAAACACCCGCGACCG 

A 

1800 ESPI,

LeuAlaGlyAlaAlaIleGlySerValGlyLeuGlyLysValLeuIleAspIleLeuAla
1802 TTAGCTGGCGCCGCCATCGGCAGTGTTGGACTGGGGAAGGTCCTCATAGACATCCTTGCA 
AATCGACCGCGGCGGTAGCCGTCACAACCTGACCCCTTCCAGGAGTATCTGTAGGAACGT 

A 

1808 KASI NARI,

GlyTyrGlyAlaGlyValAlaGlyAlaLeuValAlaPheLysIleMetSerGlyGluVal 
1862 GGGTATGGCGCGGGCGTGGCGGGAGCTCTTGTGGCATTCAAGATCATGAGCGGTGAGGTC 
CCCATACCGCGCCCGCACCGCCCTCGAGAACACCGTAAGTTCTAGTACTCGCCACTCCAG 

1884 SACI, 1905 BSPHI,

ProSerThrGluAspLeuValAsnLeuLeuProAlaIleLeuSerProGlyAlaLeuVal
1922 CCCTCCACGGAGGACCTGGTCAATCTACTGCCCGCCATCCTCTCGCCCGGAGCCCTCGTA 
GGGAGGTGCCTCCTGGACCAGTTAGATGACGGGCGGTAGGAGAGCGGGCCTCGGGAGCAT 

1934 TTH3I, 

ValGlyValValCysAlaAlaIleLeuArgArgHisValGlyProGlyGluGlyAlaVal
1982 GTCGGCGTGGTCTGTGCAGCAATACTGCGCCGGCACGTTGGCCCGGGCGAGGGGGCAGTG 
CAGCCGCACCAGACACGTCGTTATGACGCGGCCGTGCAACCGGGCCCGCTCCCCCGTCAC 

A A 

2010 NAEI, 2023 SMAI XMAI,

GlnTrpMetAsnArgLeuIleAlaPheAlaSerArgGlyAsnHisValSerProThrHis
2042 CAGTGGATGAACCGGCTGATAGCCTTCGCCTCCCGGGGGAACCATGTTTCCCCCACGCAC
GTCACCTACTTGGCCGACTATCGGAAGCGGAGGGCCCCCTTGGTACAAAGGGGGTGCGTG 

2073 SMAI XMAI, 2099 DRA3, 

TyrValProGluSerAspAlaAlaAlaArgValThrAlaIleLeuSerSerLeuThrVal
2102 TACGTGCCGGAGAGCGATGCAGCTGCCCGCGTCACTGCCATACTCAGCAGCCTCACTGTA 
ATGCACGGCCTCTCGCTACGTCGACGGGCGCAGTGACGGTATGAGTCGTCGGAGTGACAT 

A 

2121 PVU2, 

ThrGlnLeuLeuArgArgLeuHisGlnTrpIleSerSerGluCysThrThrProCysSer 
2162 ACCCAGCTCCTGAGGCGACTGCACCAGTGGATAAGCTCGGAGTGTACCACTCCATGCTCC 
TGGGTCGAGGACTCCGCTGACGTGGTCACCTATTCGAGCCTCACATGGTGAGGTACGAGG 

A A 

2165 ALWNI, 2170 MST2,

GlySerTrpLeuArgAspIleTrpAspTrpIleCysGluValLeuSerAspPheLysThr 
2222 GGTTCCTGGCTAAGGGACATCTGGGACTGGATATGCGAGGTGTTGAGCGACTTTAAGACC 

CCAAGGACCGATTCCCTGTAGACCCTGACCTATACGCTCCACAACTCGCTGAAATTCTGG

2226 ECONI,

TrpLeuLysAlaLysLeuMetProGlnLeuProGlyIleProPheValSerCysGlnArg
2282 TGGCTAAAAGCTAAGCTCATGCCACAGCTGCCTGGGATCCCCTTTGTGTCCTGCCAGCGC 
ACCGATTTTCGATTCGAGTACGGTGTCGACGGACCCTAGGGGAAACACAGGACGGTCGCG 

A A A 

2291 ESPI, 2306 PVU2, 2316 BAMHI,

GlyTyrLysGlyValTrpArgGlyAspGlyIleMetHisThrArgCysHisCysGlyAla
2342 GGGTATAAGGGGGTCTGGCGAGGGGACGGCATCATGCACACTCGCTGCCACTGTGGAGCT 
CCCATATTCCCCCAGACCGCTCCCCTGCCGTAGTACGTGTGAGCGACGGTGACACCTCGA 

GluIleThrGlyHisValLysAsnGlyThrMetArgIleValGlyProArgThrCysArg
2402 GAGATCACTGGACATGTCAAAAACGGGACGATGAGGATCGTCGGTCCTAGGACCTGCAGG 
CTCTAGTGACCTGTACAGTTTTTGCCCTGCTACTCCTAGCAGCCAGGATCCTGGACGTCC 

A AAA 

2431 BSABI, 2447 AVR2, 2454 SSE8387I, 2455 PSTI,

AsnMetTrpSerGlyThrPheProIleAsnAlaTyrThrThrGlyProCysThrProLeu 
2462 AACATGTGGAGTGGGACCTTCCCCATTAATGCCTACACCACGGGCCCCTGTACCCCCCTT
TTGTACACCTCACCCTGGAAGGGGTAATTACGGATGTGGTGCCCGGGGACATGGGGGGAA 

2486 ASEI, 2503 APAI,

ProAlaProAsnTyrThrPheAlaLeuTrpArgValSerAlaGluGluTyrValGluIle
2522 CCTGCGCCGAACTACACGTTCGCGCTATGGAGGGTGTCTGCAGAGGAATACGTGGAGATA
GGACGCGGCTTGATGTGCAAGCGCGATACCTCCCACAGACGTCTCCTTATGCACCTCTAT

A 

2559 PSTI, 

ArgGlnValGlyAspPheHisTyrValThrGlyMetThrThrAspAsnLeuLysCysPro 
2582 AGGCAGGTGGGGGACTTCCACTACGTGACGGGTATGACTACTGACAATCTTAAATGCCCG 
TCCGTCCACCCCCTGAAGGTGATGCACTGCCCATACTGATGACTGTTAGAATTTACGGGC 

A 

2600 DRA3,

CysGlnValProSerProGluPhePheThrGluLeuAspGlyValArgLeuHisArgPhe 
2642 TGCCAGGTCCCATCGCCCGAATTTTTCACAGAATTGGACGGGGTGCGCCTACATAGGTTT 
ACGGTCCAGGGTAGCGGGCTTAAAAAGTGTCTTAACCTGCCCCACGCGGATGTATCCAAA 

AlaProProCysLysProLeuLeuArgGluGluValSerPheArgValGlyLeuHisGlu
2702 GCGCCCCCCTGCAAGCCCTTGCTGCGGGAGGAGGTATCATTCAGAGTAGGACTCCACGAA 
CGCGGGGGGACGTTCGGGAACGACGCCCTCCTCCATAGTAAGTCTCATCCTGAGGTGCTT 

TyrProValGlySerGlnLeuProCysGluProGluProAspValAlaValLeuThrSer 
2762 TACCCGGTAGGGTCGCAATTACCTTGCGAGCCCGAACCGGACGTGGCCGTGTTGACGTCC
ATGGGCCATCCCAGCGTTAATGGAACGCTCGGGCTTGGCCTGCACCGGCACAACTGCAGG 

A A 

2763 HGIE2, 2815 AAT2, 

MetLeuThrAspProSerHisIleThrAlaGluAlaAlaGlyArgArgLeuAlaArgGly 
2822 ATGCTCACTGATCCCTCCCATATAACAGCAGAGGCGGCCGGGCGAAGGTTGGCGAGGGGA
TACGAGTGACTAGGGAGGGTATATTGTCGTCTCCGCCGGCCCGCTTCCAACCGCTCCCCT 

A 

2856 EAGI XMA3,

SerProProSerValAlaSerSerSerAlaSerGlnLeuSerAlaProSerLeuLysAla 
2882 TCACCCCCCTCTGTGGCCAGCTCCTCGGCTAGCCAGCTATCCGCTCCATCTCTCAAGGCA 
AGTGGGGGGAGACACCGGTCGAGGAGCCGATCGGTCGATAGGCGAGGTAGAGAGTTCCGT 

A 

2895 BALI, 2909 NHEI,

ThrCysThrAlaAsnHisAspSerProAspAlaGluLeuIleGluAlaAsnLeuLeuTrp
2942 ACTTGCACCGCTAACCATGACTCCCCTGATGCTGAGCTCATAGAGGCCAACCTCCTATGG
TGAACGTGGCGATTGGTACTGAGGGGACTACGACTCGAGTATCTCCGGTTGGAGGATACC 

A A 

2972 ESPI, 2975 SACI,

ArgGlnGluMetGlyGlyAsnIleThrArgValGluSerGluAsnLysValValIleLeu
3002 AGGCAGGAGATGGGCGGCAACATCACCAGGGTTGAGTCAGAAAACAAAGTGGTGATTCTG 
TCCGTCCTCTACCCGCCGTTGTAGTGGTCCCAACTCAGTCTTTTGTTTCACCACTAAGAC 

AspSerPheAspProLeuValAlaGluGluAspGluArgGluIleSerValProAlaGlu
3062 GACTCCTTCGATCCGCTTGTGGCGGAGGAGGACGAGCGGGAGATCTCCGTACCCGCAGAA 
CTGAGGAAGCTAGGCGAACACCGCCTCCTCCTGCTCGCCCTCTAGAGGCATGGGCGTCTT 

A 

3102 BGL2, 

IleLeuArgLysSerArgArgPheAlaGlnAlaLeuProValTrpAlaArgProAspTyr
3122 ATCCTGCGGAAGTCTCGGAGATTCGCCCAGGCCCTGCCCGTTTGGGCGCGGCCGGACTAT 
TAGGACGCCTTCAGAGCCTCTAAGCGGGTCCGGGACGGGCAAACCCGCGCCGGCCTGATA 

3149 ALWNI, 3170 EAGI XMA3,

AsnProProLeuValGluThrTrpLysLysProAspTyrGluProProValValHisGly
3182 AACCCCCCGCTAGTGGAGACGTGGAAAAAGCCCGACTACGAACCACCTGTGGTCCATGGC 
TTGGGGGGCGATCACCTCTGCACCTTTTTCGGGCTGATGCTTGGTGGACACCAGGTACCG 

3223 HGIE2, 3235 NCOI, 

CysProLeuProProProLysSerProProValProProProArgLysLysArgThrVal 
3242 TGCCCGCTTCCACCTCCAAAGTCCCCTCCTGTGCCTCCGCCTCGGAAGAAGCGGACGGTG 
ACGGGCGAAGGTGGAGGTTTCAGGGGAGGACACGGAGGCGGAGCCTTCTTCGCCTGCCAC 

ValLeuThrGluSerThrLeuSerThrAlaLeuAlaGluLeuAlaThrArgSerPheGly 
3302 GTCCTCACTGAATCAACCCTATCTACTGCCTTGGCCGAGCTCGCCACCAGAAGCTTTGGC 
CAGGAGTGACTTAGTTGGGATAGATGACGGAACCGGCTCGAGCGGTGGTCTTCGAAACCG 

3338 SACI, 3352 HIND3, 

SerSerSerThrSerGlyIleThrGlyAspAsnThrThrThrSerSerGluProAlaPro
3362 AGCTCCTCAACTTCCGGCATTACGGGCGACAATACGACAACATCCTCTGAGCCCGCCCCT 
TCGAGGAGTTGAAGGCCGTAATGCCCGCTGTTATGCTGTTGTAGGAGACTCGGGCGGGGA 

SerGlyCysProProAspSerAspAlaGluSerTyrSerSerMetProProLeuGluGly 
3422 TCTGGCTGCCCCCCCGACTCCGACGCTGAGTCCTATTCCTCCATGCCCCCCCTGGAGGGG 
AGACCGACGGGGGGGCTGAGGCTGCGACTCAGGATAAGGAGGTACGGGGGGGACCTCCCC 

3443 EAM1105I,

GluProGlyAspProAspLeuSerAspGlySerTrpSerThrValSerSerGluAlaAsn
3482 GAGCCTGGGGATCCGGATCTTAGCGACGGGTCATGGTCAACGGTCAGTAGTGAGGCCAAC 
CTCGGACCCCTAGGCCTAGAATCGCTGCCCAGTACCAGTTGCCAGTCATCACTCCGGTTG 

/V A A 

3490 BAMHI, 3491 BSABI, 3493 BSPEI,

AlaGluAspValValCysCysSerMetSerTyrSerTrpThrGlyAlaLeuValThrPro
3542 GCGGAGGATGTCGTGTGCTGCTCAATGTCTTACTCTTGGACAGGCGCACTCGTCACCCCG
CGCCTCCTACAGCACACGACGAGTTACAGAATGAGAACCTGTCCGCGTGAGCAGTGGGGC

A 

3595 DRA3, 

CysAlaAlaGluGluGlnLysLeuProIleAsnAlaLeuSerAsnSerLeuLeuArgHis 
3602 TGCGCCGCGGAAGAACAGAAACTGCCCATCAATGCACTAAGCAACTCGTTGCTACGTCAC 
ACGCGGCGCCTTCTTGTCTTTGACGGGTAGTTACGTGATTCGTTGAGCAACGATGCAGTG 

3606 SAC2, 3617 ALWNI, 3661 PFLMI,

HisAsnLeuValTyrSerThrThrSerArgSerAlaCysGlnArgGlnLysLysValThr 
3662 CACAATTTGGTGTATTCCACCACCTCACGCAGTGCTTGCCAAAGGCAGAAGAAAGTCACA
GTGTTAAACCACATAAGGTGGTGGAGTGCGTCACGAACGGTTTCCGTCTTCTTTCAGTGT

A 

3687 DRA3,

PheAspArgLeuGlnValLeuAspSerHisTyrGlnAspValLeuLysGluValLysAla 
3722 TTTGACAGACTGCAAGTTCTGGACAGCCATTACCAGGACGTACTCAAGGAGGTTAAAGCA
AAACTGTCTGACGTTCAAGACCTGTCGGTAATGGTCCTGCATGAGTTCCTCCAATTTCGT

AlaAlaSerLysValLysAlaAsnLeuLeuSerValGluGluAlaCysSerLeuThrPro
3782 GCGGCGTCAAAAGTGAAGGCTAACTTGCTATCCGTAGAGGAAGCTTGCAGCCTGACGCCC
CGCCGCAGTTTTCACTTCCGATTGAACGATAGGCATCTCCTTCGAACGTCGGACTGCGGG 

A 

3822 HIND3, 

ProHisSerAlaLysSerLysPheGlyTyrGlyAlaLysAspValArgCysHisAlaArg 
3842 CCACACTCAGCCAAATCCAAGTTTGGTTATGGGGCAAAAGACGTCCGTTGCCATGCCAGA 
GGTGTGAGTCGGTTTAGGTTCAAACCAATACCCCGTTTTCTGCAGGCAACGGTACGGTCT

3881 AAT2, 3896 BGLI,

LysAlaValThrHisIleAsnSerValTrpLysAspLeuLeuGluAspAsnValThrPro
3902 AAGGCCGTAACCCACATCAACTCCGTGTGGAAAGACCTTCTGGAAGACAATGTAACACCA
TTCCGGCATTGGGTGTAGTTGAGGCACACCTTTCTGGAAGACCTTCTGTTACATTGTGGT 

IleAspThrThrIleMetAlaLysAsnGluValPheCysValGlnProGluLysGlyGly
3962 ATAGACACTACCATCATGGCTAAGAACGAGGTTTTCTGCGTTCAGCCTGAGAAGGGGGGT 
TATCTGTGATGGTAGTACCGATTCTTGCTCCAAAAGACGCAAGTCGGACTCTTCCCCCCA 

ArgLysProAlaArgLeuIleValPheProAspLeuGlyValArgValCysGluLysMet
4022 CGTAAGCCAGCTCGTCTCATCGTGTTCCCCGATCTGGGCGTGCGCGTGTGCGAAAAGATG 
GCATTCGGTCGAGCAGAGTAGCACAAGGGGCTAGACCCGCACGCGCACACGCTTTTCTAC 

AlaLeuTyrAspValValThrLysLeuProLeuAlaValMetGlySerSerTyrGlyPhe 
4082 GCTTTGTACGACGTGGTTACAAAGCTCCCCTTGGCCGTGATGGGAAGCTCCTACGGATTC 
CGAAACATGCTGCACCAATGTTTCGAGGGGAACCGGCACTACCCTTCGAGGATGCCTAAG 

GlnTyrSerProGlyGlnArgValGluPheLeuValGlnAlaTrpLysSerLysLysThr
4142 CAATACTCACCAGGACAGCGGGTTGAATTCCTCGTGCAAGCGTGGAAGTCCAAGAAAACC
GTTATGAGTGGTCCTGTCGCCCAACTTAAGGAGCACGTTCGCACCTTCAGGTTCTTTTGG 

A 

4166 ECORI, 

ProMetGlyPheSerTyrAspThrArgCysPheAspSerThrValThrGluSerAspIle 
4202 CCAATGGGGTTCTCGTATGATACCCGCTGCTTTGACTCCACAGTCACTGAGAGCGACATC
GGTTACCCCAAGAGCATACTATGGGCGACGAAACTGAGGTGTCAGTGACTCTCGCTGTAG 

A A 

4235 DRDI, 4242 ALWNI,

ArgThrGluGluAlaIleTyrGlnCysCysAspLeuAspProGlnAlaArgValAlaIle
4262 CGTACGGAGGAGGCAATCTACCAATGTTGTGACCTCGACCCCCAAGCCCGCGTGGCCATC
GCATGCCTCCTCCGTTAGATGGTTACAACACTGGAGCTGGGGGTTCGGGCGCACCGGTAG

A A 

4307 BGLI, 4314 BALI, 

LysSerLeuThrGluArgLeuTyrValGlyGlyProLeuThrAsnSerArgGlyGluAsn 
4322 AAGTCCCTCACCGAGAGGCTTTATGTTGGGGGCCCTCTTACCAATTCAAGGGGGGAGAAC 
TTCAGGGAGTGGCTCTCCGAAATACAACCCCCGGGAGAATGGTTAAGTTCCCCCCTCTTG 

A 

4351 APAI,

CysGlyTyrArgArgCysArgAlaSerGlyValLeuThrThrSerCysGlyAsnThrLeu 
4382 TGCGGCTATCGCAGGTGCCGCGCGAGCGGCGTACTGACAACTAGCTGTGGTAACACCCTC
ACGCCGATAGCGTCCACGGCGCGCTCGCCGCATGACTGTTGATCGACACCATTGTGGGAG

ThrCysTyrIleLysAlaArgAlaAlaCysArgAlaAlaGlyLeuGlnAspCysThrMet
4442 ACTTGCTACATCAAGGCCCGGGCAGCCTGTCGAGCCGCAGGGCTCCAGGACTGCACCATG 
TGAACGATGTAGTTCCGGGCCCGTCGGACAGCTCGGCGTCCCGAGGTCCTGACGTGGTAC 

4458 SMAI XMAI, 

LeuValCysGlyAspAspLeuValValIleCysGluSerAlaGlyValGlnGluAspAla
4502 CTCGTGTGTGGCGACGACTTAGTCGTTATCTGTGAAAGCGCGGGGGTCCAGGAGGACGCG
GAGCACACACCGCTGCTGAATCAGCAATAGACACTTTCGCGCCCCCAGGTCCTCCTGCGC

4514 DRDI, 4517 TTH3I,

AlaSerLeuArgAlaPheThrGluAlaMetThrArgTyrSerAlaProProGlyAspPro
4562 GCCAGCCTGAGAGCCTTCACGGAGGCTATGACCAGGTACTCCGCCCCCCCTGGGGACCCC
CGCTCGGACTCTCGGAAGTGCCTCCGATACTGGTCCATGAGGCGGGGGGGACCCCTGGGG 

ProGlnProGluTyrAspLeuGluLeuIleThrSerCysSerSerAsnValSerValAla 
4622 CCACAACCAGAATACGACTTGGAGCTCATAACATCATGCTCCTCCAACGTGTCAGTCGCC 
GGTGTTGGTCTTATGCTGAACCTCGAGTATTGTAGTACGAGGAGGTTGCACAGTCAGCGG 

4643 SACI, 

HisAspGlyAlaGlyLysArgValTyrTyrLeuThrArgAspProThrThrProLeuAla 
4682 CACGACGGCGCTGGAAAGAGGGTCTACTACCTCACCCGTGACCCTACAACCCCCCTCGCG 
GTGCTGCCGCGACCTTTCTCCCAGATGATGGAGTGGGCACTGGGATGTTGGGGGGAGCGC 

4737 NRUI, 

ArgAlaAlaTrpGluThrAlaArgHisThrProValAsnSerTrpLeuGlyAsnIleIle
4742 AGAGCTGCGTGGGAGACAGCAAGACACACTCCAGTCAATTCCTGGCTAGGCAACATAATC 
TCTCGACGCACCCTCTGTCGTTCTGTGTGAGGTCAGTTAAGGACCGATCCGTTGTATTAG

MetPheAlaProThrLeuTrpAlaArgMetIleLeuMetThrHisPhePheSerValLeu
4802 ATGTTTGCCCCCACACTGTGGGCGAGGATGATACTGATGACCCATTTCTTTAGCGTCCTT 
TACAAACGGGGGTGTGACACCCGCTCCTACTATGACTACTGGGTAAAGAAATCGCAGGAA 

A A 

4812 PFLMI, 4813 DRA3,

IleAlaArgAspGlnLeuGluGlnAlaLeuAspCysGluIleTyrGlyAlaCysTyrSer 
4862 ATAGCCAGGGACCAGCTTGAACAGGCCCTCGATTGCGAGATCTACGGGGCCTGCTACTCC 
TATCGGTCCCTGGTCGAACTTGTCCGGGAGCTAACGCTCTAGATGCCCCGGACGATGAGG 

4899 BGL2,

IleGluProLeuAspLeuProProIleIleGlnArgLeuHisGlyLeuSerAlaPheSer
4922 ATAGAACCACTGGATCTACCTCCAATCATTCAAAGACTCCATGGCCTCAGCGCATTTTCA 
TATCTTGGTGACCTAGATGGAGGTTAGTAAGTTTCTGAGGTACCGGAGTCGCGTAAAAGT 

A 

4960 NCOI, 

LeuHisSerTyrSerProGlyGluIleAsnArgValAlaAlaCysLeuArgLysLeuGly
4982 CTCCACAGTTACTCTCCAGGTGAAATCAATAGGGTGGCCGCATGCCTCAGAAAACTTGGG 
GAGGTGTCAATGAGAGGTCCACTTTAGTTATCCCACCGGCGTACGGAGTCTTTTGAACCC 

A A 

5021 SPHI, 5041 KPNI, 

ValProProLeuArgAlaTrpArgHisArgAlaArgSerValArgAlaArgLeuLeuAla
5042 GTACCGCCCTTGCGAGCTTGGAGACACCGGGCCCGGAGCGTCCGCGCTAGGCTTCTGGCC 
CATGGCGGGAACGCTCGAACCTCTGTGGCCCGGGCCTCGCAGGCGCGATCCGAAGACCGG

5070 APAI, 5097 BALI, 

ArgGlyGlyArgAlaAlaIleCysGlyLysTyrLeuPheAsnTrpAlaValArgThrLys
5102 AGAGGAGGCAGGGCTGCCATATGTGGCAAGTACCTCTTCAACTGGGCAGTAAGAACAAAG 
TCTCCTCCGTCCCGACGGTATACACCGTTCATGGAGAAGTTGACCCGTCATTCTTGTTTC 

5119 NDEI, 

LeuLysLeuThrProIleAlaAlaAlaGlyGlnLeuAspLeuSerGlyTrpPheThrAla 
5162 CTCAAACTCACTCCAATAGCGGCCGCTGGCCAGCTGGACTTGTCCGGCTGGTTCACGGCT
GAGTTTGAGTGAGGTTATCGCCGGCGACCGGTCGACCTGAACAGGCCGACCAAGTGCCGA 

5180 NOTI, 5181 EAGI XMA3, 5188 BALI, 5192 PVU2,

GlyTyrSerGlyGlyAspIleTyrHisSerValSerHisAlaArgProArgTrpIleTrp
5222 GGCTACAGCGGGGGAGACATTTATCACAGCGTGTCTCATGCCCGGCCCCGCTGGATCTGG
CCGATGTCGCCCCCTCTGTAAATAGTGTCGCACAGAGTACGGGCCGGGGCGACCTAGACC 

A 

5246 DRA3, 

PheCysLeuLeuLeuLeuAlaAlaGlyValGlyIleTyrLeuLeuProAsnArgOP
5282 TTTTGCCTACTCCTGCTTGCTGCAGGGGTAGGCATCTACCTCCTCCCCAACCGATGAAGG 
AAAACGGATGAGGACGAACGACGTCCCCATCCGTAGATGGAGGAGGGGTTGGCTACTTCC 

A M 

5301 PSTI, 5331 HGIE2, 



5342 TTGGGGTAAACACTCCGGCCTAAAAAAAAAAAAAAATCTAGAACCCGAGTCGAC 
AACCCGATTTGTGAGGCCGGATTTTTTTTTTTTTTTAGATCTTGGGCTCAGCTG 

A A 

5378 XBAI, 5390 SALI, 


MetAlaAlaTyrAlaAlaGlnGlyTyrLysValLeuValLeuAsn
2 AGCTTACAAAACAAAATGGCTGCATATGCAGCTCAGGGCTATAAGGTGCTAGTACTCAAC
TCGAATGTTTTGTTTTACCGACGTATACGTCGAGTCCCGATATTCCACGATCATGAGTTG

1 HIND3, 24 NDEI, 52 SCAI,

ProSerValAlaAlaThrLeuGlyPheGlyAlaTyrMetSerLysAlaHisGlyIleAsp
62 CCCTCTGTTGCTGCAACACTGGGCTTTGGTGCTTACATGTCCAAGGCTCATGGGATCGAT
GGGAGACAACGACGTTGTGACCCGAAACCACGAATGTACAGGTTCCGAGTACCCTAGCTA

116 CLAI, 

ProAsnIleArgThrGlyValArgThrIleThrThrGlySerProIleThrTyrSerThr
122 CCTAACATCAGGACCGGGGTGAGAACAATTACCACTGGCAGCCCCATCACGTACTCCACC
GGATTGTAGTCCTGGCCCCACTCTTGTTAATGGTGACCGTCGGGGTAGTGCATGAGGTGG

TyrGlyLysPheLeuAlaAspGlyGlyCysSerGlyGlyAlaTyrAspIleIleIleCys
182 TACGGCAAGTTCCTTGCCGACGGCGGGTGCTCGGGGGGCGCTTATGACATAATAATTTGT
ATGCCGTTCAAGGAACGGCTGCCGCCCACGAGCCCCCCGCGAATACTGTATTATTAAACA 

AspGluCysHisSerThrAspAlaThrSerIleLeuGlyIleGlyThrValLeuAspGln
242 GACGAGTGCCACTCCACGGATGCCACATCCATCTTGGGCATTGGCACTGTCCTTGACCAA 
CTGCTCACGGTGAGGTGCCTACGGTGTAGGTAGAACCCGTAACCGTGACAGGAACTGGTT 

AlaGluThrAlaGlyAlaArgLeuValValLeuAlaThrAlaThrProProGlySerVal 
302 GCAGAGACTGCGGGGGCGAGACTGGTTGTGCTCGCCACCGCCACCCCTCCGGGCTCCGTC
CGTCTCTGACGCCCCCGCTCTGACCAACACGAGCGGTGGCGGTGGGGAGGCCCGAGGCAG 

303 ALWNI,

ThrValProHisProAsnIleGluGluValAlaLeuSerThrThrGlyGluIleProPhe
362 ACTGTGCCCCATCCCAACATCGAGGAGGTTGCTCTGTCCACCACCGGAGAGATCCCTTTT 
TGACACGGGGTAGGGTTGTAGCTCCTCCAACGAGACAGGTGGTGGCCTCTCTAGGGAAAA 

TyrGlyLysAlaIleProLeuGluValIleLysGlyGlyArgHisLeuIlePheCysHis
422 TACGGCAAGGCTATCCCCCTCGAAGTAATCAAGGGGGGGAGACATCTCATCTTCTGTCAT
ATGCCGTTCCGATAGGGGGAGCTTCATTAGTTCCCCCCCTCTGTAGAGTAGAAGACAGTA 

SerLysLysLysCysAspGluLeuAlaAlaLysLeuValAlaLeuGlyIleAsnAlaVal
482 TCAAAGAAGAAGTGCGACGAACTCGCCGCAAAGCTGGTCGCATTGGGCATCAATGCCGTG
AGTTTCTTCTTCACGCTGCTTGAGCGGCGTTTCGACCAGCGTAACCCGTAGTTACGGCAC

AlaTyrTyrArgGlyLeuAspValSerValIleProThrSerGlyAspValValValVal
542 GCCTACTACCGCGGTCTTGACGTGTCCGTCATCCCGACCAGCGGCGATGTTGTCGTCGTG 
CGGATGATGGCGCCAGAACTGCACAGGCAGTAGGGCTGGTCGCCGCTACAACAGCAGCAC 

550 SAC2, 560 DRDI,

AlaThrAspAlaLeuMetThrGlyTyrThrGlyAspPheAspSerValIleAspCysAsn
602 GCAACCGATGCCCTCATGACCGGCTATACCGGCGACTTCGACTCGGTGATAGACTGCAAT 
CGTTGGCTACGGGAGTACTGGCCGATATGGCCGCTGAAGCTGAGCCACTATCTGACGTTA 

615 BSPHI,

662 ACGTGTGTCACCCAGACAGTCGATTTCAGCCTTGACCCTACCTTCACCATTGAGACAATC
TGCACACAGTGGGTCTGTCAGCTAAAGTCGGAACTGGGATGGAAGTGGTAACTCTGTTAG

ThrLeuProGlnAspAlaValSerArgThrGlnArgArgGlyArgThrGlyArgGlyLys
722 ACGCTCCCCCAAGATGCTGTCTCCCGCACTCAACGTCGGGGCAGGACTGGCAGGGGGAAG 
TGCGAGGGGGTTCTACGACAGAGGGCGTGAGTTGCAGCCCCGTCCTGACCGTCCCCCTTC 

ProGlyIleTyrArgPheValAlaProGlyGluArgProSerGlyMetPheAspSerSer
782 CCAGGCATCTACAGATTTGTGGCACCGGGGGAGCGCCCCTCCGGCATGTTCGACTCGTCC
GGTCCGTAGATGTCTAAACACCGTGGCCCCCTCGCGGGGAGGCCGTACAAGCTGAGCAGG 

816 BGLI, 833 DRDI,

ValLeuCysGluCysTyrAspAlaGlyCysAlaTrpTyrGluLeuThrProAlaGluThr
842 GTCCTCTGTGAGTGCTATGACGCAGGCTGTGCTTGGTATGAGCTCACGCCCGCCGAGACT
CAGGAGACACTCACGATACTGCGTCCGACACGAACCATACTCGAGTGCGGGCGGCTCTGA 

A 

881 SACI,

ThrValArgLeuArgAlaTyrMetAsnThrProGlyLeuProValCysGlnAspHisLeu 
902 ACAGTTAGGCTACGAGCGTACATGAACACCCCGGGGCTTCCCGTGTGCCAGGACCATCTT 
TGTCAATCCGATGCTCGCATGTACTTGTGGGGCCCCGAAGGGCACACGGTCCTGGTAGAA 

A 

931 SMAI XMAI, 

GluPheTrpGluGlyValPheThrGlyLeuThrHisIleAspAlaHisPheLeuSerGln 
962 GAATTTTGGGAGGGCGTCTTTACAGGCCTCACTCATATAGATGCCCACTTTCTATCCCAG 
CTTAAAACCCTCCCGCAGAAATGTCCGGAGTGAGTATATCTACGGGTGAAAGATAGGGTC

A 

985 STUI, 

ThrLysGlnSerGlyGluAsnLeuProTyrLeuValAlaTyrGlnAlaThrValCysAla 
1022 ACAAAGCAGAGTGGGGAGAACCTTCCTTACCTGGTAGCGTACCAAGCCACCGTGTGCGCT 
TGTTTCGTCTCACCCCTCTTGGAAGGAATGGACCATCGCATGGTTCGGTGGCACACGCGA 

A 

1069 DRA3, 

ArgAlaGlnAlaProProProSerTrpAspGlnMetTrpLysCysLeuIleArgLeuLys 
1082 AGGGCTCAAGCCCCTCCCCCATCGTGGGACCAGATGTGGAAGTGTTTGATTCGCCTCAAG
TCCCGAGTTCGGGGAGGGGGTAGCACCCTGGTCTACACCTTCACAAACTAAGCGGAGTTC

ProThrLeuHisGlyProThrProLeuLeuTyrArgLeuGlyAlaValGlnAsnGluIle
1142 CCCACCCTCCATGGGCCAACACCCCTGCTATACAGACTGGGCGCTGTTCAGAATGAAATC
GGGTGGGAGGTACCCGGTTGTGGGGACGATATGTCTGACCCGCGACAAGTCTTACTTTAG

1150 NCOI, 

ThrLeuThrHisProValThrLysTyrIleMetThrCysMetSerAlaAspLeuGluVal
1202 ACCCTGACGCACCCAGTCACCAAATACATCATGACATGCATGTCGGCCGACCTGGAGGTC
TGGGACTGCGTGGGTCAGTGGTTTATGTAGTACTGTACGTACAGCCGGCTGGACCTCCAG

A ^ A A 

1230 BSPHI, 1234 DRDI, 1237 AVA3, 1245 EAGI XMA3, 1250 DRDI,



ValThrSerThrTrpValLeuValGlyGlyValLeuAlaAlaLeuAlaAlaTyrCysLeu
1262 GTCACGAGCACCTGGGTGCTCGTTGGCGGCGTCCTGGCTGCTTTGGCCGCGTATTGCCTG
CAGTGCTCGTGGACCCACGAGCAACCGCCGCAGGACCGACGAAACCGGCGCATAACGGAC

SerThrGlyCysValValIleValGlyArgValValLeuSerGlyLysProAlaIleIle
1322 TCAACAGGCTGCGTGGTCATAGTGGGCAGGGTCGTCTTGTCCGGGAAGCCGGCAATCATA
AGTTGTCCGACGCACCAGTATCACCCGTCCCAGCAGAACAGGCCCTTCGGCCGTTAGTAT

1369 NAEI,

ProAspArgGluValLeuTyrArgGluPheAspGluMetGluGluCysSerGlnHisLeu
1382 CCTGACAGGGAAGTCCTCTACCGAGAGTTCGATGAGATGGAAGAGTGCTCTCAGCACTTA
GGACTGTCCCTTCAGGAGATGGCTCTCAAGCTACTCTACCTTCTCACGAGAGTCGTGAAT

1385 DRDI,

ProTyrIleGluGlnGlyMetMetLeuAlaGluGlnPheLysGlnLysAlaLeuGlyLeu
1442 CCGTACATCGAGCAAGGGATGATGCTCGCCGAGCAGTTCAAGCAGAAGGCCCTCGGCCTC
GGCATGTAGCTCGTTCCCTACTACGAGCGGCTCGTCAAGTTCGTCTTCCGGGAGCCGGAG

LeuGlnThrAlaSerArgGlnAlaGluValIleAlaProAlaValGlnThrAsnTrpGln
1502 CTGCAGACCGCGTCCCGTCAGGCAGAGGTTATCGCCCCTGCTGTCCAGACCAACTGGCAA
GACGTCTGGCGCAGGGCAGTCCGTCTCCAATAGCGGGGACGACAGGTCTGGTTGACCGTT 

1502 PSTI, 1507 TTH3I, 

LysLeuGluThrPheTrpAlaLysHisMetTrpAsnPheIleSerGlyIleGlnTyrLeu
1562 AAACTCGAGACCTTCTGGGCGAAGCATATGTGGAACTTCATCAGTGGGATACAATACTTG
TTTGAGCTCTGGAAGACCCGCTTCGTATACACCTTGAAGTAGTCACCCTATGTTATGAAC

1565 XHOI, 1586 NDEI, 

AlaGlyLeuSerThrLeuProGlyAsnProAlaIleAlaSerLeuMetAlaPheThrAla
1622 GCGGGCTTGTCAACGCTGCCTGGTAACCCCGCCATTGCTTCATTGATGGCTTTTACAGCT 
CGCCCGAACAGTTGCGACGGACCATTGGGGCGGTAACGAAGTAACTACCGAAAATGTCGA 

1643 BSTE2, 1677 ALWNI PVU2,

AlaValThrSerProLeuThrThrSerGlnThrLeuLeuPheAsnIleLeuGlyGlyTrp
1682 GCTGTCACCAGCCCACTAACCACTAGCCAAACCCTCCTCTTCAACATATTGGGGGGGTGG 
CGACAGTGGTCGGGTGATTGGTGATCGGTTTGGGAGGAGAAGTTGTATAACCCCCCCACC 

ValAlaAlaGlnLeuAlaAlaProGlyAlaAlaThrAlaPheValGlyAlaGlyLeuAla
1742 GTGGCTGCCCAGCTCGCCGCCCCCGGTGCCGCTACTGCCTTTGTGGGCGCTGGCTTAGCT
CACCGACGGGTCGAGCGGCGGGGGCCACGGCGATGACGGAAACACCCGCGACCGAATCGA

1794 ESPI,

GlyAlaAlaIleGlySerValGlyLeuGlyLysValLeuIleAspIleLeuAlaGlyTyr
1802 GGCGCCGCCATCGGCAGTGTTGGACTGGGGAAGGTCCTCATAGACATCCTTGCAGGGTAT
CCGCGGCGGTAGCCGTCACAACCTGACCCCTTCCAGGAGTATCTGTAGGAACGTCCCATA

1802 KASI NARI,

GlyAlaGlyValAlaGlyAlaLeuValAlaPheLysIleMetSerGlyGluValProSer 
1862 GGCGCGGGCGTGGCGGGAGCTCTTGTGGCATTCAAGATCATGAGCGGTGAGGTCCCCTCC
CCGCGCCCGCACCGCCCTCGAGAACACCGTAAGTTCTAGTACTCGCCACTCCAGGGGAGG

A A 

1878 SACI, 1899 BSPHI,

ThrGluAspLeuValAsnLeuLeuProAlaIleLeuSerProGlyAlaLeuValValGly
1922 ACGGAGGACCTGGTCAATCTACTGCCCGCCATCCTCTCGCCCGGAGCCCTCGTAGTCGGC
TGCCTCCTGGACCAGTTAGATGACGGGCGGTAGGAGAGCGGGCCTCGGGAGCATCAGCCG

1928 TTH3I,

ValValCysAlaAlaIleLeuArgArgHisValGlyProGlyGluGlyAlaValGlnTrp
1982 GTGGTCTGTGCAGCAATACTGCGCCGGCACGTTGGCCCGGGCGAGGGGGCAGTGCAGTGG 
CACCAGACACGTCGTTATGACGCGGCCGTGCAACCGGGCCCGCTCCCCCGTCACGTCACC 

2004 NAEI, 2017 SMAI XMAI, 

MetAsnArgLeuIleAlaPheAlaSerArgGlyAsnHisValSerProThrHisTyrVal
2042 ATGAACCGGCTGATAGCCTTCGCCTCCCGGGGGAACCATGTTTCCCCCACGCACTACGTG
TACTTGGCCGACTATCGGAAGCGGAGGGCCCCCTTGGTACAAAGGGGGTGCGTGATGCAC

2067 SMAI XMAI, 2093 DRA3,

ProGluSerAspAlaAlaAlaArgValThrAlaIleLeuSerSerLeuThrValThrGln
2102 CCGGAGAGCGATGCAGCTGCCCGCGTCACTGCCATACTCAGCAGCCTCACTGTAACCCAG
GGCCTCTCGCTACGTCGACGGGCGCAGTGACGGTATGAGTCGTCGGAGTGACATTGGGTC

2115 PVU2, 2159 ALWNI,

LeuLeuArgArgLeuHisGlnTrpIleSerSerGluCysThrThrProCysSerGlySer 
2162 CTCCTGAGGCGACTGCACCAGTGGATAAGCTCGGAGTGTACCACTCCATGCTCCGGTTCC 
GAGGACTCCGCTGACGTGGTCACCTATTCGAGCCTCACATGGTGAGGTACGAGGCCAAGG 

2164 MST2, 2220 ECONI,

TrpLeuArgAspIleTrpAspTrpIleCysGluValLeuSerAspPheLysThrTrpLeu
2222 TGGCTAAGGGACATCTGGGACTGGATATGCGAGGTGTTGAGCGACTTTAAGACCTGGCTA
ACCGATTCCCTGTAGACCCTGACCTATACGCTCCACAACTCGCTGAAATTCTGGACCGAT

LysAlaLysLeuMetProGlnLeuProGlyIleProPheValSerCysGlnArgGlyTyr
2282 AAAGCTAAGCTCATGCCACAGCTGCCTGGGATCCCCTTTGTGTCCTGCCAGCGCGGGTAT
TTTCGATTCGAGTACGGTGTCGACGGACCCTAGGGGAAACACAGGACGGTCGCGCCCATA

2285 ESPI, 2300 PVU2, 2310 BAMHI,

LysGlyValTrpArgGlyAspGlyIleMetHisThrArgCysHisCysGlyAlaGluIle
2342 AAGGGGGTCTGGCGAGGGGACGGCATCATGCACACTCGCTGCCACTGTGGAGCTGAGATC 
TTCCCCCAGACCGCTCCCCTGCCGTAGTACGTGTGAGCGACGGTGACACCTCGACTCTAG 

ThrGlyHisValLysAsnGlyThrMetArgIleValGlyProArgThrCysArgAsnMet
2402 ACTGGACATGTCAAAAACGGGACGATGAGGATCGTCGGTCCTAGGACCTGCAGGAACATG 
TGACCTGTACAGTTTTTGCCCTGCTACTCCTAGCAGCCAGGATCCTGGACGTCCTTGTAC 

2425 BSABI, 2441 AVR2, 2448 SSE8387I, 2449 PSTI,

TrpSerGlyThrPheProIleAsnAlaTyrThrThrGlyProCysThrProLeuProAla
2462 TGGAGTGGGACCTTCCCCATTAATGCCTACACCACGGGCCCCTGTACCCCCCTTCCTGCG 
ACCTCACCCTGGAAGGGGTAATTACGGATGTGGTGCCCGGGGACATGGGGGGAAGGACGC 

2480 ASEI, 2497 APAI,

ProAsnTyrThrPheAlaLeuTrpArgValSerAlaGluGluTyrValGluIleArgGln 
2522 CCGAACTACACGTTCGCGCTATGGAGGGTGTCTGCAGAGGAATACGTGGAGATAAGGCAG
GGCTTGATGTGCAAGCGCGATACCTCCCACAGACGTCTCCTTATGCACCTCTATTCCGTC

2553 PSTI, 

ValGlyAspPheHisTyrValThrGlyMetThrThrAspAsnLeuLysCysProCysGln
2582 GTGGGGGACTTCCACTACGTGACGGGTATGACTACTGACAATCTTAAATGCCCGTGCCAG
CACCCCCTGAAGGTGATGCACTGCCCATACTGATGACTGTTAGAATTTACGGGCACGGTC

2594 DRA3,

ValProSerProGluPhePheThrGluLeuAspGlyValArgLeuHisArgPheAlaPro
2642 GTCCCATCGCCCGAATTTTTCACAGAATTGGACGGGGTGCGCCTACATAGGTTTGCGCCC
CAGGGTAGCGGGCTTAAAAAGTGTCTTAACCTGCCCCACGCGGATGTATCCAAACGCGGG 

ProCysLysProLeuLeuArgGluGluValSerPheArgValGlyLeuHisGluTyrPro 
2702 CCCTGCAAGCCCTTGCTGCGGGAGGAGGTATCATTCAGAGTAGGACTCCACGAATACCCG 
GGGACGTTCGGGAACGACGCCCTCCTCCATAGTAAGTCTCATCCTGAGGTGCTTATGGGC 

2757 HGIE2, 

ValGlySerGlnLeuProCysGluProGluProAspValAlaValLeuThrSerMetLeu 
2762 GTAGGGTCGCAATTACCTTGCGAGCCCGAACCGGACGTGGCCGTGTTGACGTCCATGCTC 
CATCCCAGCGTTAATGGAACGCTCGGGCTTGGCCTGCACCGGCACAACTGCAGGTACGAG

2809 AAT2,

ThrAspProSerHisIleThrAlaGluAlaAlaGlyArgArgLeuAlaArgGlySerPro
2822 ACTGATCCCTCCCATATAACAGCAGAGGCGGCCGGGCGAAGGTTGGCGAGGGGATCACCC
TGACTAGGGAGGGTATATTGTCGTCTCCGCCGGCCCGCTTCCAACCGCTCCCCTAGTGGG

2850 EAGI XMA3,

ProSerVaiAlaSerSerSerAlaSerGlnLeuSerAlaProSerLeuLysAlaThrCys 
2832 CCCTC7G7GGCCAGC7CC7CGGC7AGCCAGC7ATCCGC7CCA7C7C7CAAGGC AACTTGC 
GGGAGACACCGG7CGAGGAGCCGA7CGG7CGA7AGGCGAGG7AGAGAG77CCGT7GAACG 

2889 BALI, 2903 NHEI, 

ThrAlaAsnHisAspSerProAspAlaGluLeuIleGluAlaAsnLeuLeuTrpArgGln
2942 ACCGCTAACCATGACTCCCCTGATGCTGAGCTCATAGAGGCCAACCTCCTATGGAGGCAG
TGGCGATTGGTACTGAGGGGACTACGACTCGAGTATCTCCGGTTGGAGGATACCTCCGTC

2966 ESPl, 2969 SACI, 

GluMetGlyGlyAsnIleThrArgValGluSerGluAsnLysValValIleLeuAspSer
3002 GAGATGGGCGGCAACATCACCAGGGTTGAGTCAGAAAACAAAGTGGTGATTCTGGACTCC
CTCTACCCGCCGTTGTAGTGGTCCCAACTCAGTCTTTTGTTTCACCACTAAGACCTGAGG

PheAspProLeuValAlaGluGluAspGluArgGluIleSerValProAlaGluIleLeu 
3062 TTCGATCCGCTTGTGGCGGAGGAGGACGAGCGGGAGATCTCCGTACCCGCAGAAATCCTG
AAGCTAGGCGAACACCGCCTCCTCCTGCTCGCCCTCTAGAGGCATGGGCGTCTTTAGGAC

3096 BGL2, 

ArgLysSerArgArgPheAlaGlnAlaLeuProValTrpAlaArgProAspTyrAsnPro
3122 CGGAAGTCTCGGAGATTCGCCCAGGCCCTGCCCGTTTGGGCGCGGCCGGACTATAACCCC
GCCTTCAGAGCCTCTAAGCGGGTCCGGGACGGGCAAACCCGCGCCGGCCTGATATTGGGG

3143 ALWNl, 3164 EAGl XMA3,

ProLeuValGluThrTrpLysLysProAspTyrGluProProValValHisGlyCysPro
3182 CCGCTAGTGGAGACGTGGAAAAAGCCCGACTACGAACCACCTGTGGTCCATGGCTGCCCG 
GGCGATCACCTCTGCACCTTTTTCGGGCTGATGCTTGGTGGACACCAGGTACCGACGGGC 

3217 HGIE2, 3229 NCOI, 

LeuProProProLysSerProProValProProProArgLysLysArgThrValValLeu 
3242 CTTCCACCTCCAAAGTCCCCTCCTGTGCCTCCGCCTCGGAAGAAGCGGACGGTGGTCCTC
GAAGGTGGAGGTTTCAGGGGAGGACACGGAGGCGGAGCCTTCTTCGCCTGCCACCAGGAG 

ThrGluSerThrLeuSerThrAlaLeuAlaGluLeuAlaThrArgSerPheGlySerSer 
3302 ACTGAATCAACCCTATCTACTGCCTTGGCCGAGCTCGCCACCAGAAGCTTTGGCAGCTCC 
TGACTTAGTTGGGATAGATGACGGAACCGGCTCGAGCGGTGGTCTTCGAAACCGTCGAGG 

A A 

3332 SACI, 3346 HIND3, 

SerThrSerGlyIleThrGlyAspAsnThrThrThrSerSerGluProAlaProSerGly
3362 TCAACTTCCGGCATTACGGGCGACAATACGACAACATCCTCTGAGCCCGCCCCTTCTGGC 
AGTTGAAGGCCGTAATGCCCGCTGTTATGCTGTTGTAGGAGACTCGGGCGGGGAAGACCG 

CysProProAspSerAspAlaGluSerTyrSerSerMetProProLeuGluGlyGluPro 
3422 TGCCCCCCCGACTCCGACGCTGAGTCCTATTCCTCCATGCCCCCCCTGGAGGGGGAGCCT
ACGGGGGGGCTGAGGCTGCGACTCAGGATAAGGAGGTACGGGGGGGACCTCCCCCTCGGA 

3437 EAM11051, 

GlyAspProAspLeuSerAspGlySerTrpSerThrValSerSerGluAlaAsnAlaGlu 
3482 GGGGATCCGGATCTTAGCGACGGGTCATGGTCAACGGTCAGTAGTGAGGCCAACGCGGAG
CCCCTAGGCCTAGAATCGCTGCCCAGTACCAGTTGCCAGTCATCACTCCGGTTGCGCCTC 

3484 BAMHI, 3485 BSABl, 3487 BSPEl, 

AspValValCysCysSerMetSerTyrSerTrpThrGlyAlaLeuValThrProCysAla 
3542 GATGTCGTGTGCTGCTCAATGTCTTACTCTTGGACAGGCGCACTCGTCACCCCGTGCGCC 
CTACAGCACACGACGAGTTACAGAATGAGAACCTGTCCGCGTGAGCAGTGGGGCACGCGG 

A A 

3589 DRA3, 3600 SAC2, 

AlaGluGluGlnLysLeuProIleAsnAlaLeuSerAsnSerLeuLeuArgHisHisAsn 
3602 GCGGAAGAACAGAAACTGCCCATCAATGCACTAAGCAACTCGTTGCTACGTCACCACAAT 
CGCCTTCTTGTCTTTGACGGGTAGTTACGTGATTCGTTGAGCAACGATGCAGTGGTGTTA

A ^ 

3611 ALWNl, 3655 PFLMl, 

LeuValTyrSerThrThrSerArgSerAlaCysGlnArgGlnLysLysValThrPheAsp 
3662 TTGGTGTATTCCACCACCTCACGCAGTGCTTGCCAAAGGCAGAAGAAAGTCACATTTGAC 
AACCACATAAGGTGGTGGAGTGCGTCACGAACGGTTTCCGTCTTCTTTCAGTGTAAACTG 

A 

3681 DRA3, 

ArgLeuGlnValLeuAspSerHisTyrGlnAspValLeuLysGluValLysAlaAlaAla 
3722 AGACTGCAAGTTCTGGACAGCCATTACCAGGACGTACTCAAGGAGGTTAAAGCAGCGGCG 
TCTGACGTTCAAGACCTGTCGGTAATGGTCCTGCATGAGTTCCTCCAATTTCGTCGCCGC

SerLysValLysAlaAsnLeuLeuSerValGluGluAlaCysSerLeuThrProProHis
3782 TCAAAAGTGAAGGCTAACTTGCTATCCGTAGAGGAAGCTTGCAGCCTGACGCCCCCACAC
AGTTTTCACTTCCGATTGAACGATAGGCATCTCCTTCGAACGTCGGACTGCGGGGGTGTG

3816 HIND3,

SerAlaLysSerLysPheGlyTyrGlyAlaLysAspValArgCysHisAlaArgLysAla
3842 TCAGCCAAATCCAAGTTTGGTTATGGGGCAAAAGACGTCCGTTGCCATGCCAGAAAGGCC
AGTCGGTTTAGGTTCAAACCAATACCCCGTTTTCTGCAGGCAACGGTACGGTCTTTCCGG

3875 AAT2, 3890 BGLI,

ValThrHisIleAsnSerValTrpLysAspLeuLeuGluAspAsnValThrProIleAsp
3902 GTAACCCACATCAACTCCGTGTGGAAAGACCTTCTGGAAGACAATGTAACACCAATAGAC
CATTGGGTGTAGTTGAGGCACACCTTTCTGGAAGACCTTCTGTTACATTGTGGTTATCTG

ThrThrIleMetAlaLysAsnGluValPheCysValGlnProGluLysGlyGlyArgLys
3962 ACTACCATCATGGCTAAGAACGAGGTTTTCTGCGTTCAGCCTGAGAAGGGGGGTCGTAAG
TGATGGTAGTACCGATTCTTGCTCCAAAAGACGCAAGTCGGACTCTTCCCCCCAGCATTC 

ProAlaArgLeuIleValPheProAspLeuGlyValArgValCysGluLysMetAlaLeu 
4022 CCAGCTCGTCTCATCGTGTTCCCCGATCTGGGCGTGCGCGTGTGCGAAAAGATGGCTTTG 
GGTCGAGCAGAGTAGCACAAGGGGCTAGACCCGCACGCGCACACGCTTTTCTACCGAAAC 

TyrAspValValThrLysLeuProLeuAlaValMetGlySerSerTyrGlyPheGlnTyr 
4082 TACGACGTGGTTACAAAGCTCCCCTTGGCCGTGATGGGAAGCTCCTACGGATTCCAATAC 
ATGCTGCACCAATGTTTCGAGGGGAACCGGCACTACCCTTCGAGGATGCCTAAGGTTATG

SerProGlyGlnArgValGluPheLeuValGlnAlaTrpLysSerLysLysThrProMet
4142 TCACCAGGACAGCGGGTTGAATTCCTCGTGCAAGCGTGGAAGTCCAAGAAAACCCCAATG
AGTGGTCCTGTCGCCCAACTTAAGGAGCACGTTCGCACCTTCAGGTTCTTTTGGGGTTAC 

4160 ECORI, 

GlyPheSerTyrAspThrArgCysPheAspSerThrValThrGluSerAspIleArgThr
4202 GGGTTCTCGTATGATACCCGCTGCTTTGACTCCACAGTCACTGAGAGCGACATCCGTACG
CCCAAGAGCATACTATGGGCGACGAAACTGAGGTGTCAGTGACTCTCGCTGTAGGCATGC 

A A 

4229 DRDl, 4236 ALWNl, 

GluGluAlaIleTyrGlnCysCysAspLeuAspProGlnAlaArgValAlaIleLysSer
4262 GAGGAGGCAATCTACCAATGTTGTGACCTCGACCCCCAAGCCCGCGTGGCCATCAAGTCC
CTCCTCCGTTAGATGGTTACAACACTGGAGCTGGGGGTTCGGGCGCACCGGTAGTTCAGG 

A A 

4301 BGLI, 4308 BALI, 

LeuThrGluArgLeuTyrValGlyGlyProLeuThrAsnSerArgGlyGluAsnCysGly 
4322 CTCACCGAGAGGCTTTATGTTGGGGGCCCTCTTACCAATTCAAGGGGGGAGAACTGCGGC
GAGTGGCTCTCCGAAATACAACCCCCGGGAGAATGGTTAAGTTCCCCCCTCTTGACGCCG

A 

4345 APAI, 

TyrArgArgCysArgAlaSerGlyValLeuThrThrSerCysGlyAsnThrLeuThrCys
4382 TATCGCAGGTGCCGCGCGAGCGGCGTACTGACAACTAGCTGTGGTAACACCCTCACTTGC
ATAGCGTCCACGGCGCGCTCGCCGCATGACTGTTGATCGACACCATTGTGGGAGTGAACG

TyrIleLysAlaArgAlaAlaCysArgAlaAlaGlyLeuGlnAspCysThrMetLeuVal
4442 TACATCAAGGCCCGGGCAGCCTGTCGAGCCGCAGGGCTCCAGGACTGCACCATGCTCGTG
ATGTAGTTCCGGGCCCGTCGGACAGCTCGGCGTCCCGAGGTCCTGACGTGGTACGAGCAC

4452 SMAI XMAI,

CysGlyAspAspLeuValValIleCysGluSerAlaGlyValGlnGluAspAlaAlaSer
4502 TGTGGCGACGACTTAGTCGTTATCTGTGAAAGCGCGGGGGTCCAGGAGGACGCGGCGAGC 
ACACCGCTGCTGAATCAGCAATAGACACTTTCGCGCCCCCAGGTCCTCCTGCGCCGCTCG 

4508 DRDl, 4511 TTH3I,

LeuArgAlaPheThrGluAlaMetThrArgTyrSerAlaProProGlyAspProProGln
4562 CTGAGAGCCTTCACGGAGGCTATGACCAGGTACTCCGCCCCCCCTGGGGACCCCCCACAA
GACTCTCGGAAGTGCCTCCGATACTGGTCCATGAGGCGGGGGGGACCCCTGGGGGGTGTT 

ProGluTyrAspLeuGluLeuIleThrSerCysSerSerAsnValSerValAlaHisAsp
4622 CCAGAATACGACTTGGAGCTCATAACATCATGCTCCTCCAACGTGTCAGTCGCCCACGAC 
GGTCTTATGCTGAACCTCGAGTATTGTAGTACGAGGAGGTTGCACAGTCAGCGGGTGCTG 

4637 SACI,

GlyAlaGlyLysArgValTyrTyrLeuThrArgAspProThrThrProLeuAlaArgAla
4682 GGCGCTGGAAAGAGGGTCTACTACCTCACCCGTGACCCTACAACCCCCCTCGCGAGAGCT
CCGCGACCTTTCTCCCAGATGATGGAGTGGGCACTGGGATGTTGGGGGGAGCGCTCTCGA 

4731 NRUI, 

AlaTrpGluThrAlaArgHisThrProValAsnSerTrpLeuGlyAsnIleIleMetPhe
4742 GCGTGGGAGACAGCAAGACACACTCCAGTCAATTCCTGGCTAGGCAACATAATCATGTTT
CGCACCCTCTGTCGTTCTGTGTGAGGTCAGTTAAGGACCGATCCGTTGTATTAGTACAAA 

AlaProThrLeuTrpAlaArgMetIleLeuMetThrHisPhePheSerValLeuIleAla
4802 GCCCCCACACTGTGGGCGAGGATGATACTGATGACCCATTTCTTTAGCGTCCTTATAGCC
CGGGGGTGTGACACCCGCTCCTACTATGACTACTGGGTAAAGAAATCGCAGGAATATCGG 

4806 PFLMl, 4807 DRA3, 

ArgAspGlnLeuGluGlnAlaLeuAspCysGluIleTyrGlyAlaCysTyrSerIleGlu
4862 AGGGACCAGCTTGAACAGGCCCTCGATTGCGAGATCTACGGGGCCTGCTACTCCATAGAA
TCCCTGGTCGAACTTGTCCGGGAGCTAACGCTCTAGATGCCCCGGACGATGAGGTATCTT 

4893 BGL2, 

ProLeuAspLeuProProIleIleGlnArgLeuHisGlyLeuSerAlaPheSerLeuHis
4922 CCACTGGATCTACCTCCAATCATTCAAAGACTCCATGGCCTCAGCGCATTTTCACTCCAC
GGTGACCTAGATGGAGGTTAGTAAGTTTCTGAGGTACCGGAGTCGCGTAAAAGTGAGGTG

4954 NCOI,

SerTyrSerProGlyGluIleAsnArgValAlaAlaCysLeuArgLysLeuGlyValPro
4982 AGTTACTCTCCAGGTGAAATCAATAGGGTGGCCGCATGCCTCAGAAAACTTGGGGTACCG
TCAATGAGAGGTCCACTTTAGTTATCCCACCGGCGTACGGAGTCTTTTGAACCCCATGGC 

5015 SPHI, 5035 KPNI, 

ProLeuArgAlaTrpArgHisArgAlaArgSerValArgAlaArgLeuLeuAlaArgGly
5042 CCCTTGCGAGCTTGGAGACACCGGGCCCGGAGCGTCCGCGCTAGGCTTCTGGCCAGAGGA
GGGAACGCTCGAACCTCTGTGGCCCGGGCCTCGCAGGCGCGATCCGAAGACCGGTCTCCT

5064 APAI, 5091 BALI,

GlyArgAlaAlaIleCysGlyLysTyrLeuPheAsnTrpAlaValArgThrLysLeuLys
5102 GGCAGGGCTGCCATATGTGGCAAGTACCTCTTCAACTGGGCAGTAAGAACAAAGCTCAAA 
CCGTCCCGACGGTATACACCGTTCATGGAGAAGTTGACCCGTCATTCTTGTTTCGAGTTT 

5113 NDEI, 

LeuThrProIleAlaAlaAlaGlyGlnLeuAspLeuSerGlyTrpPheThrAlaGlyTyr 
5162 CTCACTCCAATAGCGGCCGCTGGCCAGCTGGACTTGTCCGGCTGGTTCACGGCTGGCTAC
GAGTGAGGTTATCGCCGGCGACCGGTCGACCTGAACAGGCCGACCAAGTGCCGACCGATG

5174 NOTI, 5175 EAGl XMA3, 5182 BALI, 5186 PVU2,

SerGlyGlyAspIleTyrHisSerValSerHisAlaArgProArgTrpIleTrpPheCys 
5222 AGCGGGGGAGACATTTATCACAGCGTGTCTCATGCCCGGCCCCGCTGGATCTGGTTTTGC 
TCGCCCCCTCTGTAAATAGTGTCGCACAGAGTACGGGCCGGGGCGACCTAGACCAAAACG 

5240 DRA3, 

LeuLeuLeuLeuAlaAlaGlyValGlyIleTyrLeuLeuProAsnArgOP
5282 CTACTCCTGCTTGCTGCAGGGGTAGGCATCTACCTCCTCCCCAACCGATGAATAGTCGAC 
GATGAGGACGAACGACGTCCCCATCCGTAGATGGAGGAGGGGTTGGCTACTTATCAGCTG 

5295 PSTI, 5336 SALI,

MetAlaAlaTyrAlaAlaGlnGlyTyrLysValLeuValLeuAsn
2 AGCTTACAAAACAAAATGGCTGCATATGCAGCTCAGGGCTATAAGGTGCTAGTACTCAAC 
TCGAATGTTTTGTTTTACCGACGTATACGTCGAGTCCCGATATTCCACGATCATGAGTTG 

1 HIND3, 24 NDEI, 52 SCAI, 

ProSerValAlaAlaThrLeuGlyPheGlyAlaTyrMetSerLysAlaHisGlyIleAsp
62 CCCTCTGTTGCTGCAACACTGGGCTTTGGTGCTTACATGTCCAAGGCTCATGGGATCGAT 
GGGAGACAACGACGTTGTGACCCGAAACCACGAATGTACAGGTTCCGAGTACCCTAGCTA 

116 CLAI, 

ProAsnIleArgThrGlyValArgThrIleThrThrGlySerProIleThrTyrSerThr
122 CCTAACATCAGGACCGGGGTGAGAACAATTACCACTGGCAGCCCCATCACGTACTCCACC 
GGATTGTAGTCCTGGCCCCACTCTTGTTAATGGTGACCGTCGGGGTAGTGCATGAGGTGG 

TyrGlyLysPheLeuAlaAspGlyGlyCysSerGlyGlyAlaTyrAspIleIleIleCys
182 TACGGCAAGTTCCTTGCCGACGGCGGGTGCTCGGGGGGCGCTTATGACATAATAATTTGT
ATGCCGTTCAAGGAACGGCTGCCGCCCACGAGCCCCCCGCGAATACTGTATTATTAAACA 

AspGluCysHisSerThrAspAlaThrSerIleLeuGlyIleGlyThrValLeuAspGln
242 GACGAGTGCCACTCCACGGATGCCACATCCATCTTGGGCATTGGCACTGTCCTTGACCAA 
CTGCTCACGGTGAGGTGCCTACGGTGTAGGTAGAACCCGTAACCGTGACAGGAACTGGTT 

AlaGluThrAlaGlyAlaArgLeuValValLeuAlaThrAlaThrProProGlySerVal
302 GCAGAGACTGCGGGGGCGAGACTGGTTGTGCTCGCCACCGCCACCCCTCCGGGCTCCGTC 
CGTCTCTGACGCCCCCGCTCTGACCAACACGAGCGGTGGCGGTGGGGAGGCCCGAGGCAG 

303 ALWNl, 

ThrValProHisProAsnIleGluGluValAlaLeuSerThrThrGlyGluIleProPhe
362 ACTGTGCCCCATCCCAACATCGAGGAGGTTGCTCTGTCCACCACCGGAGAGATCCCTTTT 
TGACACGGGGTAGGGTTGTAGCTCCTCCAACGAGACAGGTGGTGGCCTCTCTAGGGAAAA 

TyrGlyLysAlaIleProLeuGluValIleLysGlyGlyArgHisLeuIlePheCysHis
422 TACGGCAAGGCTATCCCCCTCGAAGTAATCAAGGGGGGGAGACATCTCATCTTCTGTCAT 
ATGCCGTTCCGATAGGGGGAGCTTCATTAGTTCCCCCCCTCTGTAGAGTAGAAGACAGTA

SerLysLysLysCysAspGluLeuAlaAlaLysLeuValAlaLeuGlyIleAsnAlaVal
482 TCAAAGAAGAAGTGCGACGAACTCGCCGCAAAGCTGGTCGCATTGGGCATCAATGCCGTG
AGTTTCTTCTTCACGCTGCTTGAGCGGCGTTTCGACCAGCGTAACCCGTAGTTACGGCAC

AlaTyrTyrArgGlyLeuAspValSerValIleProThrSerGlyAspValValValVal
542 GCCTACTACCGCGGTCTTGACGTGTCCGTCATCCCGACCAGCGGCGATGTTGTCGTCGTG 
CGGATGATGGCGCCAGAACTGCACAGGCAGTAGGGCTGGTCGCCGCTACAACAGCAGCAC 

/I ^ 

550 SAC2, 560 DRDl, 

AlaThrAspAlaLeuMetThrGlyTyrThrGlyAspPheAspSerValIleAspCysAsn
602 GCAACCGATGCCCTCATGACCGGCTATACCGGCGACTTCGACTCGGTGATAGACTGCAAT
CGTTGGCTACGGGAGTACTGGCCGATATGGCCGCTGAAGCTGAGCCACTATCTGACGTTA

615 BSPH1,

ThrCysValThrGlnThrValAspPheSerLeuAspProThrPheThrIleGluThrIle
662 ACGTGTGTCACCCAGACAGTCGATTTCAGCCTTGACCCTACCTTCACCATTGAGACAATC 
TGCACACAGTGGGTCTGTCAGCTAAAGTCGGAACTGGGATGGAAGTGGTAACTCTGTTAG 

ThrLeuProGlnAspAlaValSerArgThrGlnArgArgGlyArgThrGlyArgGlyLys 
722 ACGCTCCCCCAAGATGCTGTCTCCCGCACTCAACGTCGGGGCAGGACTGGCAGGGGGAAG 
TGCGAGGGGGTTCTACGACAGAGGGCGTGAGTTGCAGCCCCGTCCTGACCGTCCCCCTTC 

ProGlyIleTyrArgPheValAlaProGlyGluArgProSerGlyMetPheAspSerSer
782 CCAGGCATCTACAGATTTGTGGCACCGGGGGAGCGCCCCTCCGGCATGTTCGACTCGTCC 
GGTCCGTAGATGTCTAAACACCGTGGCCCCCTCGCGGGGAGGCCGTACAAGCTGAGCAGG 

816 BGLI, 833 DRDl, 

ValLeuCysGluCysTyrAspAlaGlyCysAlaTrpTyrGluLeuThrProAlaGluThr 
842 GTCCTCTGTGAGTGCTATGACGCAGGCTGTGCTTGGTATGAGCTCACGCCCGCCGAGACT
CAGGAGACACTCACGATACTGCGTCCGACACGAACCATACTCGAGTGCGGGCGGCTCTGA 

881 SACI, 

ThrValArgLeuArgAlaTyrMetAsnThrProGlyLeuProValCysGlnAspHisLeu 
902 ACAGTTAGGCTACGAGCGTACATGAACACCCCGGGGCTTCCCGTGTGCCAGGACCATCTT 
TGTCAATCCGATGCTCGCATGTACTTGTGGGGCCCCGAAGGGCACACGGTCCTGGTAGAA 

931 SMAI XMAI, 

GluPheTrpGluGlyValPheThrGlyLeuThrHisIleAspAlaHisPheLeuSerGln 
962 GAATTTTGGGAGGGCGTCTTTACAGGCCTCACTCATATAGATGCCCACTTTCTATCCCAG 
CTTAAAACCCTCCCGCAGAAATGTCCGGAGTGAGTATATCTACGGGTGAAAGATAGGGTC 

985 STUI, 

ThrLysGlnSerGlyGluAsnLeuProTyrLeuValAlaTyrGlnAlaThrValCysAla 
1022 ACAAAGCAGAGTGGGGAGAACCTTCCTTACCTGGTAGCGTACCAAGCCACCGTGTGCGCT 
TGTTTCGTCTCACCCCTCTTGGAAGGAATGGACCATCGCATGGTTCGGTGGCACACGCGA 

1069 DRA3, 

ArgAlaGlnAlaProProProSerTrpAspGlnMetTrpLysCysLeuIleArgLeuLys 
1082 AGGGCTCAAGCCCCTCCCCCATCGTGGGACCAGATGTGGAAGTGTTTGATTCGCCTCAAG
TCCCGAGTTCGGGGAGGGGGTAGCACCCTGGTCTACACCTTCACAAACTAAGCGGAGTTC

ProThrLeuHisGlyProThrProLeuLeuTyrArgLeuGlyAlaValGlnAsnGluIle
1142 CCCACCCTCCATGGGCCAACACCCCTGCTATACAGACTGGGCGCTGTTCAGAATGAAATC
GGGTGGGAGGTACCCGGTTGTGGGGACGATATGTCTGACCCGCGACAAGTCTTACTTTAG

A 

1150 NCOI, 

ThrLeuThrHisProValThrLysTyrIleMetThrCysMetSerAlaAspLeuGluVal
1202 ACCCTGACGCACCCAGTCACCAAATACATCATGACATGCATGTCGGCCGACCTGGAGGTC 
TGGGACTGCGTGGGTCAGTGGTTTATGTAGTACTGTACGTACAGCCGGCTGGACCTCCAG 

AAA A A 

1230 BSPHl, 1234 DRDl, 1237 AVA3, 1245 EAGl XMA3, 1250 DRDl, 



ValThrSerThrTrpValLeuValGlyGlyValLeuAlaAlaLeuAlaAlaTyrCysLeu 
1262 GTCACGAGCACCTGGGTGCTCGTTGGCGGCGTCCTGGCTGCTTTGGCCGCGTATTGCCTG 
CAGTGCTCGTGGACCCACGAGCAACCGCCGCAGGACCGACGAAACCGGCGCATAACGGAC 

SerThrGlyCysValValIleValGlyArgValValLeuSerGlyLysProAlaIleIle
1322 TCAACAGGCTGCGTGGTCATAGTGGGCAGGGTCGTCTTGTCCGGGAAGCCGGCAATCATA 
AGTTGTCCGACGCACCAGTATCACCCGTCCCAGCAGAACAGGCCCTTCGGCCGTTAGTAT 

A 

1369 NAEI, 

ProAspArgGluValLeuTyrArgGluPheAspGluMetGluGluCysSerGlnHisLeu 
1382 CCTGACAGGGAAGTCCTCTACCGAGAGTTCGATGAGATGGAAGAGTGCTCTCAGCACTTA
GGACTGTCCCTTCAGGAGATGGCTCTCAAGCTACTCTACCTTCTCACGAGAGTCGTGAAT 

A 

1385 DRDl, 

ProTyrIleGluGlnGlyMetMetLeuAlaGluGlnPheLysGlnLysAlaLeuGlyLeu
1442 CCGTACATCGAGCAAGGGATGATGCTCGCCGAGCAGTTCAAGCAGAAGGCCCTCGGCCTC
GGCATGTAGCTCGTTCCCTACTACGAGCGGCTCGTCAAGTTCGTCTTCCGGGAGCCGGAG

LeuGlnThrAlaSerArgGlnAlaGluValIleAlaProAlaValGlnThrAsnTrpGln
1502 CTGCAGACCGCGTCCCGTCAGGCAGAGGTTATCGCCCCTGCTGTCCAGACCAACTGGCAA
GACGTCTGGCGCAGGGCAGTCCGTCTCCAATAGCGGGGACGACAGGTCTGGTTGACCGTT 

A A 

1502 PSTI, 1507 TTH3I,

LysLeuGluThrPheTrpAlaLysHisMetTrpAsnPheIleSerGlyIleGlnTyrLeu
1562 AAACTCGAGACCTTCTGGGCGAAGCATATGTGGAACTTCATCAGTGGGATACAATACTTG
TTTGAGCTCTGGAAGACCCGCTTCGTATACACCTTGAAGTAGTCACCCTATGTTATGAAC 

A A 

1565 XHOI, 1586 NDEI, 

AlaGlyLeuSerThrLeuProGlyAsnProAlaIleAlaSerLeuMetAlaPheThrAla
1622 GCGGGCTTGTCAACGCTGCCTGGTAACCCCGCCATTGCTTCATTGATGGCTTTTACAGCT 
CGCCCGAACAGTTGCGACGGACCATTGGGGCGGTAACGAAGTAACTACCGAAAATGTCGA 

A M 

1643 BSTE2, 1677 ALWNl PVU2,

AlaValThrSerProLeuThrThrSerGlnThrLeuLeuPheAsnIleLeuGlyGlyTrp
1682 GCTGTCACCAGCCCACTAACCACTAGCCAAACCCTCCTCTTCAACATATTGGGGGGGTGG
CGACAGTGGTCGGGTGATTGGTGATCGGTTTGGGAGGAGAAGTTGTATAACCCCCCCACC

ValAlaAlaGlnLeuAlaAlaProGlyAlaAlaThrAlaPheValGlyAlaGlyLeuAla
1742 GTGGCTGCCCAGCTCGCCGCCCCCGGTGCCGCTACTGCCTTTGTGGGCGCTGGCTTAGCT
CACCGACGGGTCGAGCGGCGGGGGCCACGGCGATGACGGAAACACCCGCGACCGAATCGA

1794 ESPl, 

GlyAlaAlaIleGlySerValGlyLeuGlyLysValLeuIleAspIleLeuAlaGlyTyr
1802 GGCGCCGCCATCGGCAGTGTTGGACTGGGGAAGGTCCTCATAGACATCCTTGCAGGGTAT
CCGCGGCGGTAGCCGTCACAACCTGACCCCTTCCAGGAGTATCTGTAGGAACGTCCCATA

1802 KASl NARI, 

GlyAlaGlyValAlaGlyAlaLeuValAlaPheLysIleMetSerGlyGluValProSer
1862 GGCGCGGGCGTGGCGGGAGCTCTTGTGGCATTCAAGATCATGAGCGGTGAGGTCCCCTCC 
CCGCGCCCGCACCGCCCTCGAGAACACCGTAAGTTCTAGTACTCGCCACTCCAGGGGAGG 

1878 SACI, 1899 BSPHl, 

ThrGluAspLeuValAsnLeuLeuProAlaIleLeuSerProGlyAlaLeuValValGly
1922 ACGGAGGACCTGGTCAATCTACTGCCCGCCATCCTCTCGCCCGGAGCCCTCGTAGTCGGC 
TGCCTCCTGGACCAGTTAGATGACGGGCGGTAGGAGAGCGGGCCTCGGGAGCATCAGCCG 

1928 TTH3I, 

ValValCysAlaAlaIleLeuArgArgHisValGlyProGlyGluGlyAlaValGlnTrp
1982 GTGGTCTGTGCAGCAATACTGCGCCGGCACGTTGGCCCGGGCGAGGGGGCAGTGCAGTGG 
CACCAGACACGTCGTTATGACGCGGCCGTGCAACCGGGCCCGCTCCCCCGTCACGTCACC 

2004 NAEI, 2017 SMAI XMAI, 

MetAsnArgLeuIleAlaPheAlaSerArgGlyAsnHisValSerProThrHisTyrVal 
2042 ATGAACCGGCTGATAGCCTTCGCCTCCCGGGGGAACCATGTTTCCCCCACGCACTACGTG
TACTTGGCCGACTATCGGAAGCGGAGGGCCCCCTTGGTACAAAGGGGGTGCGTGATGCAC 

2067 SMAI XMAI, 2093 DRA3, 

ProGluSerAspAlaAlaAlaArgValThrAlaIleLeuSerSerLeuThrValThrGln
2102 CCGGAGAGCGATGCAGCTGCCCGCGTCACTGCCATACTCAGCAGCCTCACTGTAACCCAG 
GGCCTCTCGCTACGTCGACGGGCGCAGTGACGGTATGAGTCGTCGGAGTGACATTGGGTC 

2113 PVU2, 2159 ALWNl, 

LeuLeuArgArgLeuHisGlnTrpIleSerSerGluCysThrThrProCysSerGlySer 
2162 CTCCTGAGGCGACTGCACCAGTGGATAAGCTCGGAGTGTACCACTCCATGCTCCGGTTCC 
GAGGACTCCGCTGACGTGGTCACCTATTCGAGCCTCACATGGTGAGGTACGAGGCCAAGG 

2164 MST2, 2220 ECONl, 

TrpLeuArgAspIleTrpAspTrpIleCysGluValLeuSerAspPheLysThrTrpLeu 
2222 TGGCTAAGGGACATCTGGGACTGGATATGCGAGGTGTTGAGCGACTTTAAGACCTGGCTA 
ACCGATTCCCTGTAGACCCTGACCTATACGCTCCACAACTCGCTGAAATTCTGGACCGAT 

LysAlaLysLeuMetProGlnLeuProGlyIleProPheValSerCysGlnArgGlyTyr
2282 AAAGCTAAGCTCATGCCACAGCTGCCTGGGATCCCCTTTGTGTCCTGCCAGCGCGGGTAT 
TTTCGATTCGAGTACGGTGTCGACGGACCCTAGGGGAAACACAGGACGGTCGCGCCCATA 

2285 ESPl, 2300 PVU2, 2310 BAMHI,

LysGlyValTrpArgGlyAspGlyIleMetHisThrArgCysHisCysGlyAlaGluIle
2342 AAGGGGGTCTGGCGAGGGGACGGCATCATGCACACTCGCTGCCACTGTGGAGCTGAGATC
TTCCCCCAGACCGCTCCCCTGCCGTAGTACGTGTGAGCGACGGTGACACCTCGACTCTAG

ThrGlyHisValLysAsnGlyThrMetArgIleValGlyProArgThrCysArgAsnMet
2402 ACTGGACATGTCAAAAACGGGACGATGAGGATCGTCGGTCCTAGGACCTGCAGGAACATG 
TGACCTGTACAGTTTTTGCCCTGCTACTCCTAGCAGCCAGGATCCTGGACGTCCTTGTAC 

A AAA 

2425 BSABl, 2441 AVR2, 2448 SSE83871, 2449 PSTI, 

TrpSerGlyThrPheProIleAsnAlaTyrThrThrGlyProCysThrProLeuProAla
2462 TGGAGTGGGACCTTCCCCATTAATGCCTACACCACGGGCCCCTGTACCCCCCTTCCTGCG
ACCTCACCCTGGAAGGGGTAATTACGGATGTGGTGCCCGGGGACATGGGGGGAAGGACGC 

^ A 

2480 ASEl, 2497 APAI,

ProAsnTyrThrPheAlaLeuTrpArgValSerAlaGluGluTyrValGluIleArgGln
2522 CCGAACTACACGTTCGCGCTATGGAGGGTGTCTGCAGAGGAATACGTGGAGATAAGGCAG
GGCTTGATGTGCAAGCGCGATACCTCCCACAGACGTCTCCTTATGCACCTCTATTCCGTC 

A 

2553 PSTI, 

ValGlyAspPheHisTyrValThrGlyMetThrThrAspAsnLeuLysCysProCysGln 
2582 GTGGGGGACTTCCACTACGTGACGGGTATGACTACTGACAATCTTAAATGCCCGTGCCAG 
CACCCCCTGAAGGTGATGCACTGCCCATACTGATGACTGTTAGAATTTACGGGCACGGTC 

A 

2594 DRA3, 

ValProSerProGluPhePheThrGluLeuAspGlyValArgLeuHisArgPheAlaPro 
2642 GTCCCATCGCCCGAATTTTTCACAGAATTGGACGGGGTGCGCCTACATAGGTTTGCGCCC 
CAGGGTAGCGGGCTTAAAAAGTGTCTTAACCTGCCCCACGCGGATGTATCCAAACGCGGG 

ProCysLysProLeuLeuArgGluGluValSerPheArgValGlyLeuHisGluTyrPro
2702 CCCTGCAAGCCCTTGCTGCGGGAGGAGGTATCATTCAGAGTAGGACTCCACGAATACCCG
GGGACGTTCGGGAACGACGCCCTCCTCCATAGTAAGTCTCATCCTGAGGTGCTTATGGGC

A 

2757 HGIE2, 

ValGlySerGlnLeuProCysGluProGluProAspValAlaValLeuThrSerMetLeu 
2762 GTAGGGTCGCAATTACCTTGCGAGCCCGAACCGGACGTGGCCGTGTTGACGTCCATGCTC 
CATCCCAGCGTTAATGGAACGCTCGGGCTTGGCCTGCACCGGCACAACTGCAGGTACGAG 

A 

2809 AAT2, 

ThrAspProSerHisIleThrAlaGluAlaAlaGlyArgArgLeuAlaArgGlySerPro 
2822 ACTGATCCCTCCCATATAACAGCAGAGGCGGCCGGGCGAAGGTTGGCGAGGGGATCACCC 
TGACTAGGGAGGGTATATTGTCGTCTCCGCCGGCCCGCTTCCAACCGCTCCCCTAGTGGG 

2850 EAGl XMA3, 

ProSerValAlaSerSerSerAlaSerGlnLeuSerAlaProSerLeuLysAlaThrCys 
2882 CCCTCTGTGGCCAGCTCCTCGGCTAGCCAGCTATCCGCTCCATCTCTCAAGGCAACTTGC 
GGGAGACACCGGTCGAGGAGCCGATCGGTCGATAGGCGAGGTAGAGAGTTCCGTTGAACG 

A A 

2889 BALI, 2903 NHEI,

ThrAlaAsnHisAspSerProAspAlaGluLeuIleGluAlaAsnLeuLeuTrpArgGln
2942 ACCGCTAACCATGACTCCCCTGATGCTGAGCTCATAGAGGCCAACCTCCTATGGAGGCAG 
TGGCGATTGGTACTGAGGGGACTACGACTCGAGTATCTCCGGTTGGAGGATACCTCCGTC 

2966 ESPl, 2969 SACI, 

GluMetGlyGlyAsnIleThrArgValGluSerGluAsnLysValValIleLeuAspSer
3002 GAGATGGGCGGCAACATCACCAGGGTTGAGTCAGAAAACAAAGTGGTGATTCTGGACTCC 
CTCTACCCGCCGTTGTAGTGGTCCCAACTCAGTCTTTTGTTTCACCACTAAGACCTGAGG 

PheAspProLeuValAlaGluGluAspGluArgGluIleSerValProAlaGluIleLeu
3062 TTCGATCCGCTTGTGGCGGAGGAGGACGAGCGGGAGATCTCCGTACCCGCAGAAATCCTG 
AAGCTAGGCGAACACCGCCTCCTCCTGCTCGCCCTCTAGAGGCATGGGCGTCTTTAGGAC 

A 

3096 BGL2, 

ArgLysSerArgArgPheAlaGlnAlaLeuProValTrpAlaArgProAspTyrAsnPro 
3122 CGGAAGTCTCGGAGATTCGCCCAGGCCCTGCCCGTTTGGGCGCGGCCGGACTATAACCCC
GCCTTCAGAGCCTCTAAGCGGGTCCGGGACGGGCAAACCCGCGCCGGCCTGATATTGGGG 

A. A 

3143 ALWNl, 3164 EAGl XMA3, 

ProLeuValGluThrTrpLysLysProAspTyrGluProProValValHisGlyCysPro 
3182 CCGCTAGTGGAGACGTGGAAAAAGCCCGACTACGAACCACCTGTGGTCCATGGCTGCCCG 
GGCGATCACCTCTGCACCTTTTTCGGGCTGATGCTTGGTGGACACCAGGTACCGACGGGC 

3217 HGIE2, 3229 NCOI, 

LeuProProProLysSerProProValProProProArgLysLysArgThrValValLeu
3242 CTTCCACCTCCAAAGTCCCCTCCTGTGCCTCCGCCTCGGAAGAAGCGGACGGTGGTCCTC
GAAGGTGGAGGTTTCAGGGGAGGACACGGAGGCGGAGCCTTCTTCGCCTGCCACCAGGAG 

ThrGluSerThrLeuSerThrAlaLeuAlaGluLeuAlaThrArgSerPheGlySerSer 
3302 ACTGAATCAACCCTATCTACTGCCTTGGCCGAGCTCGCCACCAGAAGCTTTGGCAGCTCC 
TGACTTAGTTGGGATAGATGACGGAACCGGCTCGAGCGGTGGTCTTCGAAACCGTCGAGG 

A A 

3332 SACI, 3346 HIND3, 

SerThrSerGlyIleThrGlyAspAsnThrThrThrSerSerGluProAlaProSerGly
3362 TCAACTTCCGGCATTACGGGCGACAATACGACAACATCCTCTGAGCCCGCCCCTTCTGGC 
AGTTGAAGGCCGTAATGCCCGCTGTTATGCTGTTGTAGGAGACTCGGGCGGGGAAGACCG 

CysProProAspSerAspAlaGluSerTyrSerSerMetProProLeuGluGlyGluPro 
3422 TGCCCCCCCGACTCCGACGCTGAGTCCTATTCCTCCATGCCCCCCCTGGAGGGGGAGCCT
ACGGGGGGGCTGAGGCTGCGACTCAGGATAAGGAGGTACGGGGGGGACCTCCCCCTCGGA 

A 

3437 EAM11051, 

GlyAspProAspLeuSerAspGlySerTrpSerThrValSerSerGluAlaAsnAlaGlu 
3482 GGGGATCCGGATCTTAGCGACGGGTCATGGTCAACGGTCAGTAGTGAGGCCAACGCGGAG 
CCCCTAGGCCTAGAATCGCTGCCCAGTACCAGTTGCCAGTCATCACTCCGGTTGCGCCTC 

3484 BAMHI, 3485 BSABl, 3487 BSPEl, 

AspValValCysCysSerMetSerTyrSerTrpThrGlyAlaLeuValThrProCysAla 
3542 GATGTCGTGTGCTGCTCAATGTCTTACTCTTGGACAGGCGCACTCGTCACCCCGTGCGCC
CTACAGCACACGACGAGTTACAGAATGAGAACCTGTCCGCGTGAGCAGTGGGGCACGCGG

3589 DRA3, 3600 SAC2,

AlaGluGluGlnLysLeuProIleAsnAlaLeuSerAsnSerLeuLeuArgHisHisAsn
3602 GCGGAAGAACAGAAACTGCCCATCAATGCACTAAGCAACTCGTTGCTACGTCACCACAAT
CGCCTTCTTGTCTTTGACGGGTAGTTACGTGATTCGTTGAGCAACGATGCAGTGGTGTTA 

A ^ 

3611 ALWNl, 3655 PFLMl, 

LeuValTyrSerThrThrSerArgSerAlaCysGlnArgGlnLysLysValThrPheAsp
3662 TTGGTGTATTCCACCACCTCACGCAGTGCTTGCCAAAGGCAGAAGAAAGTCACATTTGAC
AACCACATAAGGTGGTGGAGTGCGTCACGAACGGTTTCCGTCTTCTTTCAGTGTAAACTG

3681 DRA3, 

ArgLeuGlnValLeuAspSerHisTyrGlnAspValLeuLysGluValLysAlaAlaAla 
3722 AGACTGCAAGTTCTGGACAGCCATTACCAGGACGTACTCAAGGAGGTTAAAGCAGCGGCG
TCTGACGTTCAAGACCTGTCGGTAATGGTCCTGCATGAGTTCCTCCAATTTCGTCGCCGC 

SerLysValLysAlaAsnLeuLeuSerValGluGluAlaCysSerLeuThrProProHis 
3782 TCAAAAGTGAAGGCTAACTTGCTATCCGTAGAGGAAGCTTGCAGCCTGACGCCCCCACAC
AGTTTTCACTTCCGATTGAACGATAGGCATCTCCTTCGAACGTCGGACTGCGGGGGTGTG 

3816 HIND3, 

SerAlaLysSerLysPheGlyTyrGlyAlaLysAspValArgCysHisAlaArgLysAla 
3842 TCAGCCAAATCCAAGTTTGGTTATGGGGCAAAAGACGTCCGTTGCCATGCCAGAAAGGCC 
AGTCGGTTTAGGTTCAAACCAATACCCCGTTTTCTGCAGGCAACGGTACGGTCTTTCCGG 

A A 

3875 AAT2, 3890 BGLI, 

ValThrHisIleAsnSerValTrpLysAspLeuLeuGluAspAsnValThrProIleAsp 
3902 GTAACCCACATCAACTCCGTGTGGAAAGACCTTCTGGAAGACAATGTAACACCAATAGAC 
CATTGGGTGTAGTTGAGGCACACCTTTCTGGAAGACCTTCTGTTACATTGTGGTTATCTG 

ThrThrIleMetAlaLysAsnGluValPheCysValGlnProGluLysGlyGlyArgLys
3962 ACTACCATCATGGCTAAGAACGAGGTTTTCTGCGTTCAGCCTGAGAAGGGGGGTCGTAAG 
TGATGGTAGTACCGATTCTTGCTCCAAAAGACGCAAGTCGGACTCTTCCCCCCAGCATTC 

ProAlaArgLeuIleValPheProAspLeuGlyValArgValCysGluLysMetAlaLeu 
4022 CCAGCTCGTCTCATCGTGTTCCCCGATCTGGGCGTGCGCGTGTGCGAAAAGATGGCTTTG
GGTCGAGCAGAGTAGCACAAGGGGCTAGACCCGCACGCGCACACGCTTTTCTACCGAAAC 

TyrAspValValThrLysLeuProLeuAlaValMetGlySerSerTyrGlyPheGlnTyr 
4082 TACGACGTGGTTACAAAGCTCCCCTTGGCCGTGATGGGAAGCTCCTACGGATTCCAATAC 
ATGCTGCACCAATGTTTCGAGGGGAACCGGCACTACCCTTCGAGGATGCCTAAGGTTATG 

SerProGlyGlnArgValGluPheLeuValGlnAlaTrpLysSerLysLysThrProMet 
4142 TCACCAGGACAGCGGGTTGAATTCCTCGTGCAAGCGTGGAAGTCCAAGAAAACCCCAATG 
AGTGGTCCTGTCGCCCAACTTAAGGAGCACGTTCGCACCTTCAGGTTCTTTTGGGGTTAC 

A 

4160 ECORI, 

GlyPheSerTyrAspThrArgCysPheAspSerThrValThrGluSerAspIleArgThr 
4202 GGGTTCTCGTATGATACCCGCTGCTTTGACTCCACAGTCACTGAGAGCGACATCCGTACG 
CCCAAGAGCATACTATGGGCGACGAAACTGAGGTGTCAGTGACTCTCGCTGTAGGCATGC

4229 DRDl, 4236 ALWNl,

GluGluAlaIleTyrGlnCysCysAspLeuAspProGlnAlaArgValAlaIleLysSer
4262 GAGGAGGCAATCTACCAATGTTGTGACCTCGACCCCCAAGCCCGCGTGGCCATCAAGTCC 
CTCCTCCGTTAGATGGTTACAACACTGGAGCTGGGGGTTCGGGCGCACCGGTAGTTCAGG 

4301 BGLI, 4308 BALI,

LeuThrGluArgLeuTyrValGlyGlyProLeuThrAsnSerArgGlyGluAsnCysGly
4322 CTCACCGAGAGGCTTTATGTTGGGGGCCCTCTTACCAATTCAAGGGGGGAGAACTGCGGC
GAGTGGCTCTCCGAAATACAACCCCCGGGAGAATGGTTAAGTTCCCCCCTCTTGACGCCG

4345 APAI,

TyrArgArgCysArgAlaSerGlyValLeuThrThrSerCysGlyAsnThrLeuThrCys
4382 TATCGCAGGTGCCGCGCGAGCGGCGTACTGACAACTAGCTGTGGTAACACCCTCACTTGC
ATAGCGTCCACGGCGCGCTCGCCGCATGACTGTTGATCGACACCATTGTGGGAGTGAACG 

TyrIleLysAlaArgAlaAlaCysArgAlaAlaGlyLeuGlnAspCysThrMetLeuVal
4442 TACATCAAGGCCCGGGCAGCCTGTCGAGCCGCAGGGCTCCAGGACTGCACCATGCTCGTG
ATGTAGTTCCGGGCCCGTCGGACAGCTCGGCGTCCCGAGGTCCTGACGTGGTACGAGCAC 

4452 SMAI XMAI, 

CysGlyAspAspLeuValValIleCysGluSerAlaGlyValGlnGluAspAlaAlaSer
4502 TGTGGCGACGACTTAGTCGTTATCTGTGAAAGCGCGGGGGTCCAGGAGGACGCGGCGAGC
ACACCGCTGCTGAATCAGCAATAGACACTTTCGCGCCCCCAGGTCCTCCTGCGCCGCTCG 

4508 DRDl, 4511 TTH3I, 

LeuArgAlaPheThrGluAlaMetThrArgTyrSerAlaProProGlyAspProProGln 
4562 CTGAGAGCCTTCACGGAGGCTATGACCAGGTACTCCGCCCCCCCTGGGGACCCCCCACAA 
GACTCTCGGAAGTGCCTCCGATACTGGTCCATGAGGCGGGGGGGACCCCTGGGGGGTGTT 

ProGluTyrAspLeuGluLeuIleThrSerCysSerSerAsnValSerValAlaHisAsp 
4622 CCAGAATACGACTTGGAGCTCATAACATCATGCTCCTCCAACGTGTCAGTCGCCCACGAC
GGTCTTATGCTGAACCTCGAGTATTGTAGTACGAGGAGGTTGCACAGTCAGCGGGTGCTG 

A 

4637 SACI,

GlyAlaGlyLysArgValTyrTyrLeuThrArgAspProThrThrProLeuAlaArgAla
4682 GGCGCTGGAAAGAGGGTCTACTACCTCACCCGTGACCCTACAACCCCCCTCGCGAGAGCT
CCGCGACCTTTCTCCCAGATGATGGAGTGGGCACTGGGATGTTGGGGGGAGCGCTCTCGA 

A 

4731 NRUI, 

AlaTrpGluThrAlaArgHisThrProValAsnSerTrpLeuGlyAsnIleIleMetPhe
4742 GCGTGGGAGACAGCAAGACACACTCCAGTCAATTCCTGGCTAGGCAACATAATCATGTTT 
CGCACCCTCTGTCGTTCTGTGTGAGGTCAGTTAAGGACCGATCCGTTGTATTAGTACAAA 

AlaProThrLeuTrpAlaArgMetIleLeuMetThrHisPhePheSerValLeuIleAla
4802 GCCCCCACACTGTGGGCGAGGATGATACTGATGACCCATTTCTTTAGCGTCCTTATAGCC 
CGGGGGTGTGACACCCGCTCCTACTATGACTACTGGGTAAAGAAATCGCAGGAATATCGG 

A A 

4806 PFLMl, 4807 DRA3,

ArgAspGlnLeuGluGlnAlaLeuAspCysGluIleTyrGlyAlaCysTyrSerIleGlu
4862 AGGGACCAGCTTGAACAGGCCCTCGATTGCGAGATCTACGGGGCCTGCTACTCCATAGAA
TCCCTGGTCGAACTTGTCCGGGAGCTAACGCTCTAGATGCCCCGGACGATGAGGTATCTT

4893 BGL2,

ProLeuAspLeuProProIleIleGlnArgLeuHisGlyLeuSerAlaPheSerLeuHis
4922 CCACTGGATCTACCTCCAATCATTCAAAGACTCCATGGCCTCAGCGCATTTTCACTCCAC
GGTGACCTAGATGGAGGTTAGTAAGTTTCTGAGGTACCGGAGTCGCGTAAAAGTGAGGTG 

4954 NCOI,

SerTyrSerProGlyGluIleAsnArgValAlaAlaCysLeuArgLysLeuGlyValPro
4982 AGTTACTCTCCAGGTGAAATCAATAGGGTGGCCGCATGCCTCAGAAAACTTGGGGTACCG
TCAATGAGAGGTCCACTTTAGTTATCCCACCGGCGTACGGAGTCTTTTGAACCCCATGGC 

5015 SPHI, 5035 KPNI, 

ProLeuArgAlaTrpArgHisArgAlaArgSerValArgAlaArgLeuLeuAlaArgGly 
5042 CCCTTGCGAGCTTGGAGACACCGGGCCCGGAGCGTCCGCGCTAGGCTTCTGGCCAGAGGA 
GGGAACGCTCGAACCTCTGTGGCCCGGGCCTCGCAGGCGCGATCCGAAGACCGGTCTCCT 

A A 

5064 APAI, 5091 BALI, 

GlyArgAlaAlaIleCysGlyLysTyrLeuPheAsnTrpAlaValArgThrLysLeuLys
5102 GGCAGGGCTGCCATATGTGGCAAGTACCTCTTCAACTGGGCAGTAAGAACAAAGCTCAAA 
CCGTCCCGACGGTATACACCGTTCATGGAGAAGTTGACCCGTCATTCTTGTTTCGAGTTT 

5113 NDEI,

LeuThrProIleAlaAlaAlaGlyGlnLeuAspLeuSerGlyTrpPheThrAlaGlyTyr 
5162 CTCACTCCAATAGCGGCCGCTGGCCAGCTGGACTTGTCCGGCTGGTTCACGGCTGGCTAC 
GAGTGAGGTTATCGCCGGCGACCGGTCGACCTGAACAGGCCGACCAAGTGCCGACCGATG 

A A A A 

5174 NOTI, 5175 EAGl XMA3, 5182 BALI, 5186 PVU2, 

SerGlyGlyAspIleTyrHisSerValSerHisAlaArgProArgTrpIleTrpPheCys 
5222 AGCGGGGGAGACATTTATCACAGCGTGTCTCATGCCCGGCCCCGCTGGATCTGGTTTTGC 
TCGCCCCCTCTGTAAATAGTGTCGCACAGAGTACGGGCCGGGGCGACCTAGACCAAAACG 

A 

5240 DRA3, 

LeuLeuLeuLeuAlaAlaGlyValGlyIleTyrLeuLeuProAsnArgMetSerThrAsn
5282 CTACTCCTGCTTGCTGCAGGGGTAGGCATCTACCTCCTCCCCAACCGAATGAGCACGAAT
GATGAGGACGAACGACGTCCCCATCCGTAGATGGAGGAGGGGTTGGCTTACTCGTGCTTA 

A 

5295 PSTI, 

ProLysProGlnArgLysThrLysArgAsnThrAsnArgArgProGlnAspValLysPhe
5342 CCTAAACCTCAAAGAAAGACCAAACGTAACACCAACCGGCGGCCGCAGGACGTCAAGTTC 
GGATTTGGAGTTTCTTTCTGGTTTGCATTGTGGTTGGCCGCCGGCGTCCTGCAGTTCAAG 

A A A A 

5380 NOTI, 5381 EAGl XMA3, 5390 AAT2, 5401 SMAI XMAI, 

ProGlyGlyGlyGlnIleValGlyGlyValTyrLeuLeuProArgArgGlyProArgLeu
5402 CCGGGTGGCGGTCAGATCGTTGGTGGAGTTTACTTGTTGCCGCGCAGGGGCCCTAGATTG 
GGCCCACCGCCAGTCTAGCAACCACCTCAAATGAACAACGGCGCGTCCCCGGGATCTAAC

5449 APAI,

GlyValArgAlaThrArgLysThrSerGluArgSerGlnProArgGlyArgArgGlnPro 
5462 GGTGTGCGCGCGACGAGAAAGACTTCCGAGCGGTCGCAACCTCGAGGTAGACGTCAGCCT 
CCACACGCGCGCTGCTCTTTCTGAAGGCTCGCCAGCGTTGGAGCTCCATCTGCAGTCGGA

5467 BSSH2, 5478 XMNI, 5502 XHOI, 5511 AAT2, 

IleProLysAlaArgArgProGluGlyArgThrTrpAlaGlnProGlyTyrProTrpPro 
5522 ATCCCCAAGGCTCGTCGGCCCGAGGGCAGGACCTGGGCTCAGCCCGGGTACCCTTGGCCC 
TAGGGGTTCCGAGCAGCCGGGCTCCCGTCCTGGACCCGAGTCGGGCCCATGGGAACCGGG 

A AAA 

5548 ALWNl, 5558 ESPl, 5564 SMAI XMAI, 5568 KPNI,

LeuTyrGlyAsnGluGlyCysGlyTrpAlaGlyTrpLeuLeuSerProArgGlySerArg
5582 CTCTATGGCAATGAGGGCTGCGGGTGGGCGGGATGGCTCCTGTCTCCCCGTGGCTCTCGG
GAGATACCGTTACTCCCGACGCCCACCCGCCCTACCGAGGACAGAGGGGCACCGAGAGCC 

ProSerTrpGlyProThrAspProArgArgArgSerArgAsnLeuGlyLysOC AM 
5642 CCTAGCTGGGGCCCCACAGACCCCCGGCGTAGGTCGCGCAATTTGGGTAAGTAATAGTCG 
GGATCGACCCCGGGGTGTCTGGGGGCCGCATCCAGCGCGTTAAACCCATTCATTATCAGC 

A 

A 

5650 APAI, 5698 SALI, 



5702 AC 
TG

MetAlaAlaTyrAlaAlaGlnGlyTyrLysValLeuValLeuAsn
2 AGCTTACAAAACAAAATGGCTGCATATGCAGCTCAGGGCTATAAGGTGCTAGTACTCAAC
TCGAATGTTTTGTTTTACCGACGTATACGTCGAGTCCCGATATTCCACGATCATGAGTTG

1 HIND3, 24 NDEI, 52 SCAI, 

ProSerValAlaAlaThrLeuGlyPheGlyAlaTyrMetSerLysAlaHisGlylleAsp 
62 CCCTCTGTTGCTGCAACACTGGGCTTTGGTGCTTACATGTCCAAGGCTCATGGGATCGAT 
GGGAGACAACGACGTTGTGACCCGAAACCACGAATGTACAGGTTCCGAGTACCCTAGCTA 

116 CLAI,

ProAsnIleArgThrGlyValArgThrIleThrThrGlySerProIleThrTyrSerThr
122 CCTAACATCAGGACCGGGGTGAGAACAATTACCACTGGCAGCCCCATCACGTACTCCACC
GGATTGTAGTCCTGGCCCCACTCTTGTTAATGGTGACCGTCGGGGTAGTGCATGAGGTGG 

TyrGlyLysPheLeuAlaAspGlyGlyCysSerGlyGlyAlaTyrAspIleIleIleCys
182 TACGGCAAGTTCCTTGCCGACGGCGGGTGCTCGGGGGGCGCTTATGACATAATAATTTGT
ATGCCGTTCAAGGAACGGCTGCCGCCCACGAGCCCCCCGCGAATACTGTATTATTAAACA 

AspGluCysHisSerThrAspAlaThrSerIleLeuGlyIleGlyThrValLeuAspGln
242 GACGAGTGCCACTCCACGGATGCCACATCCATCTTGGGCATTGGCACTGTCCTTGACCAA 
CTGCTCACGGTGAGGTGCCTACGGTGTAGGTAGAACCCGTAACCGTGACAGGAACTGGTT 

AlaGluThrAlaGlyAlaArgLeuValValLeuAlaThrAlaThrProProGlySerVal
302 GCAGAGACTGCGGGGGCGAGACTGGTTGTGCTCGCCACCGCCACCCCTCCGGGCTCCGTC 
CGTCTCTGACGCCCCCGCTCTGACCAACACGAGCGGTGGCGGTGGGGAGGCCCGAGGCAG 

A 

303 ALWNl, 

ThrValProHisProAsnIleGluGluValAlaLeuSerThrThrGlyGluIleProPhe
362 ACTGTGCCCCATCCCAACATCGAGGAGGTTGCTCTGTCCACCACCGGAGAGATCCCTTTT 
TGACACGGGGTAGGGTTGTAGCTCCTCCAACGAGACAGGTGGTGGCCTCTCTAGGGAAAA 

TyrGlyLysAlaIleProLeuGluValIleLysGlyGlyArgHisLeuIlePheCysHis
422 TACGGCAAGGCTATCCCCCTCGAAGTAATCAAGGGGGGGAGACATCTCATCTTCTGTCAT 
ATGCCGTTCCGATAGGGGGAGCTTCATTAGTTCCCCCCCTCTGTAGAGTAGAAGACAGTA 

SerLysLysLysCysAspGluLeuAlaAlaLysLeuValAlaLeuGlyIleAsnAlaVal
482 TCAAAGAAGAAGTGCGACGAACTCGCCGCAAAGCTGGTCGCATTGGGCATCAATGCCGTG
AGTTTCTTCTTCACGCTGCTTGAGCGGCGTTTCGACCAGCGTAACCCGTAGTTACGGCAC

AlaTyrTyrArgGlyLeuAspValSerValIleProThrSerGlyAspValValValVal
542 GCCTACTACCGCGGTCTTGACGTGTCCGTCATCCCGACCAGCGGCGATGTTGTCGTCGTG
CGGATGATGGCGCCAGAACTGCACAGGCAGTAGGGCTGGTCGCCGCTACAACAGCAGCAC

550 SAC2, 560 DRDl,

AlaThrAspAlaLeuMetThrGlyTyrThrGlyAspPheAspSerValIleAspCysAsn
602 GCAACCGATGCCCTCATGACCGGCTATACCGGCGACTTCGACTCGGTGATAGACTGCAAT 
CGTTGGCTACGGGAGTACTGGCCGATATGGCCGCTGAAGCTGAGCCACTATCTGACGTTA

ThrCysValThrGlnThrValAspPheSerLeuAspProThrPheThrIleGluThrIle
662 ACGTGTGTCACCCAGACAGTCGATTTCAGCCTTGACCCTACCTTCACCATTGAGACAATC
TGCACACAGTGGGTCTGTCAGCTAAAGTCGGAACTGGGATGGAAGTGGTAACTCTGTTAG

ThrLeuProGlnAspAlaValSerArgThrGlnArgArgGlyArgThrGlyArgGlyLys 
722 ACGCTCCCCCAAGATGCTGTCTCCCGCACTCAACGTCGGGGCAGGACTGGCAGGGGGAAG 
TGCGAGGGGGTTCTACGACAGAGGGCGTGAGTTGCAGCCCCGTCCTGACCGTCCCCC7TC 

ProGlyIleTyrArgPheValAlaProGlyGluArgProSerGlyMetPheAspSerSer
782 CCAGGCATCTACAGATTTGTGGCACCGGGGGAGCGCCCCTCCGGCATGTTCGACTCGTCC 
GGTCCGTAGATGTCTAAACACCGTGGCCCCCTCGCGGGGAGGCCGTACAAGCTGAGCAGG 

A A 

816 BGLI, 833 DRD1,

ValLeuCysGluCysTyrAspAlaGlyCysAlaTrpTyrGluLeuThrProAlaGluThr 
842 GTCCTCTGTGAGTGCTATGACGCAGGCTGTGCTTGGTATGAGCTCACGCCCGCCGAGACT 
CAGGAGACACTCACGATACTGCGTCCGACACGAACCATACTCGAGTGCGGGCGGCTCTGA 

A 

881 SACI, 

ThrValArgLeuArgAlaTyrMetAsnThrProGlyLeuProValCysGlnAspHisLeu
902 ACAGTTAGGCTACGAGCGTACATGAACACCCCGGGGCTTCCCGTGTGCCAGGACCATCTT 
TGTCAATCCGATGCTCGCATGTACTTGTGGGGCCCCGAAGGGCACACGGTCCTGGTAGAA 

A 

931 SMAI XMAI, 

GluPheTrpGluGlyValPheThrGlyLeuThrHisIleAspAlaHisPheLeuSerGln 
962 GAATTTTGGGAGGGCGTCTTTACAGGCCTCACTCATATAGATGCCCACTTTCTATCCCAG 
CTTAAAACCCTCCCGCAGAAATGTCCGGAGTGAGTATATCTACGGGTGAAAGATAGGGTC 

985 STUI, 

ThrLysGlnSerGlyGluAsnLeuProTyrLeuValAlaTyrGlnAlaThrValCysAla 
1022 ACAAAGCAGAGTGGGGAGAACCTTCCTTACCTGGTAGCGTACCAAGCCACCGTGTGCGCT 
TGTTTCGTCTCACCCCTCTTGGAAGGAATGGACCATCGCATGGTTCGGTGGCACACGCGA

A 

1069 DRA3, 

ArgAlaGlnAlaProProProSerTrpAspGlnMetTrpLysCysLeuIleArgLeuLys 
1082 AGGGCTCAAGCCCCTCCCCCATCGTGGGACCAGATGTGGAAGTGTTTGATTCGCCTCAAG 
TCCCGAGTTCGGGGAGGGGGTAGCACCCTGGTCTACACCTTCACAAACTAAGCGGAGTTC 

ProThrLeuHisGlyProThrProLeuLeuTyrArgLeuGlyAlaValGlnAsnGluIle 
1142 CCCACCCTCCATGGGCCAACACCCCTGCTATACAGACTGGGCGCTGTTCAGAATGAAATC 
GGGTGGGAGGTACCCGGTTGTGGGGACGATATGTCTGACCCGCGACAAGTCTTACTTTAG

A 

1150 NCOI, 

ThrLeuThrHisProValThrLysTyrIleMetThrCysMetSerAlaAspLeuGluVal
1202 ACCCTGACGCACCCAGTCACCAAATACATCATGACATGCATGTCGGCCGACCTGGAGGTC 
TGGGACTGCGTGGGTCAGTGGTTTATGTAGTACTGTACGTACAGCCGGCTGGACCTCCAG 

AAA A A 

1230 BSPH1, 1234 DRD1, 1237 AVA3, 1245 EAG1 XMA3, 1250 DRD1,



ValThrSerThrTrpValLeuValGlyGlyValLeuAlaAlaLeuAlaAlaTyrCysLeu
1262 GTCACGAGCACCTGGGTGCTCGTTGGCGGCGTCCTGGCTGCTTTGGCCGCGTATTGCCTG
CAGTGCTCGTGGACCCACGAGCAACCGCCGCAGGACCGACGAAACCGGCGCATAACGGAC

SerThrGlyCysValValIleValGlyArgValValLeuSerGlyLysProAlaIleIle
1322 TCAACAGGCTGCGTGGTCATAGTGGGCAGGGTCGTCTTGTCCGGGAAGCCGGCAATCATA 
AGTTGTCCGACGCACCAGTATCACCCGTCCCAGCAGAACAGGCCCTTCGGCCGTTAGTAT 

1369 NAEI, 

ProAspArgGluValLeuTyrArgGluPheAspGluMetGluGluCysSerGlnHisLeu 
1382 CCTGACAGGGAAGTCCTCTACCGAGAGTTCGATGAGATGGAAGAGTGCTCTCAGCACTTA 
GGACTGTCCCTTCAGGAGATGGCTCTCAAGCTACTCTACCTTCTCACGAGAGTCGTGAAT 

1385 DRD1,

ProTyrIleGluGlnGlyMetMetLeuAlaGluGlnPheLysGlnLysAlaLeuGlyLeu
1442 CCGTACATCGAGCAAGGGATGATGCTCGCCGAGCAGTTCAAGCAGAAGGCCCTCGGCCTC
GGCATGTAGCTCGTTCCCTACTACGAGCGGCTCGTCAAGTTCGTCTTCCGGGAGCCGGAG 

LeuGlnThrAlaSerArgGlnAlaGluValIleAlaProAlaValGlnThrAsnTrpGln
1502 CTGCAGACCGCGTCCCGTCAGGCAGAGGTTATCGCCCCTGCTGTCCAGACCAACTGGCAA 
GACGTCTGGCGCAGGGCAGTCCGTCTCCAATAGCGGGGACGACAGGTCTGGTTGACCGTT 

1502 PSTI, 1507 TTH3I, 

LysLeuGluThrPheTrpAlaLysHisMetTrpAsnPheIleSerGlyIleGlnTyrLeu
1562 AAACTCGAGACCTTCTGGGCGAAGCATATGTGGAACTTCATCAGTGGGATACAATACTTG 
TTTGAGCTCTGGAAGACCCGCTTCGTATACACCTTGAAGTAGTCACCCTATGTTATGAAC 

1565 XHOI, 1586 NDEI,

AlaGlyLeuSerThrLeuProGlyAsnProAlaIleAlaSerLeuMetAlaPheThrAla
1622 GCGGGCTTGTCAACGCTGCCTGGTAACCCCGCCATTGCTTCATTGATGGCTTTTACAGCT 
CGCCCGAACAGTTGCGACGGACCATTGGGGCGGTAACGAAGTAACTACCGAAAATGTCGA 

A A 

1643 BSTE2, 1677 ALWN1 PVU2,

AlaValThrSerProLeuThrThrSerGlnThrLeuLeuPheAsnIleLeuGlyGlyTrp
1682 GCTGTCACCAGCCCACTAACCACTAGCCAAACCCTCCTCTTCAACATATTGGGGGGGTGG 
CGACAGTGGTCGGGTGATTGGTGATCGGTTTGGGAGGAGAAGTTGTATAACCCCCCCACC 

ValAlaAlaGlnLeuAlaAlaProGlyAlaAlaThrAlaPheValGlyAlaGlyLeuAla
1742 GTGGCTGCCCAGCTCGCCGCCCCCGGTGCCGCTACTGCCTTTGTGGGCGCTGGCTTAGCT
CACCGACGGGTCGAGCGGCGGGGGCCACGGCGATGACGGAAACACCCGCGACCGAATCGA 

A 

1794 ESP1,

GlyAlaAlaIleGlySerValGlyLeuGlyLysValLeuIleAspIleLeuAlaGlyTyr
1802 GGCGCCGCCATCGGCAGTGTTGGACTGGGGAAGGTCCTCATAGACATCCTTGCAGGGTAT 
CCGCGGCGGTAGCCGTCACAACCTGACCCCTTCCAGGAGTATCTGTAGGAACGTCCCATA 

A 

1802 KAS1 NARI,

GlyAlaGlyValAlaGlyAlaLeuValAlaPheLysIleMetSerGlyGluValProSer 
1862 GGCGCGGGCGTGGCGGGAGCTCTTGTGGCATTCAAGATCATGAGCGGTGAGGTCCCCTCC 
CCGCGCCCGCACCGCCCTCGAGAACACCGTAAGTTCTAGTACTCGCCACTCCAGGGGAGG 

A A 

1878 SACI, 1899 BSPH1,



WO 01/38360    PCT/US00/32326

FIGURE 18 - Page 4



ThrGluAspLeuValAsnLeuLeuProAlaIleLeuSerProGlyAlaLeuValValGly
1922 ACGGAGGACCTGGTCAATCTACTGCCCGCCATCCTCTCGCCCGGAGCCCTCGTAGTCGGC 
TGCCTCCTGGACCAGTTAGATGACGGGCGGTAGGAGAGCGGGCCTCGGGAGCATCAGCCG 

1928 TTH3I,

ValValCysAlaAlaIleLeuArgArgHisValGlyProGlyGluGlyAlaValGlnTrp
1982 GTGGTCTGTGCAGCAATACTGCGCCGGCACGTTGGCCCGGGCGAGGGGGCAGTGCAGTGG 
CACCAGACACGTCGTTATGACGCGGCCGTGCAACCGGGCCCGCTCCCCCGTCACGTCACC 

2004 NAEI, 2017 SMAI XMAI,

MetAsnArgLeuIleAlaPheAlaSerArgGlyAsnHisValSerProThrHisTyrVal 
2042 ATGAACCGGCTGATAGCCTTCGCCTCCCGGGGGAACCATGTTTCCCCCACGCACTACGTG
TACTTGGCCGACTATCGGAAGCGGAGGGCCCCCTTGGTACAAAGGGGGTGCGTGATGCAC 

2067 SMAI XMAI, 2093 DRA3, 

ProGluSerAspAlaAlaAlaArgValThrAlaIleLeuSerSerLeuThrValThrGln
2102 CCGGAGAGCGATGCAGCTGCCCGCGTCACTGCCATACTCAGCAGCCTCACTGTAACCCAG 
GGCCTCTCGCTACGTCGACGGGCGCAGTGACGGTATGAGTCGTCGGAGTGACATTGGGTC 

2115 PVU2, 2159 ALWN1,

LeuLeuArgArgLeuHisGlnTrpIleSerSerGluCysThrThrProCysSerGlySer 
2162 CTCCTGAGGCGACTGCACCAGTGGATAAGCTCGGAGTGTACCACTCCATGCTCCGGTTCC 
GAGGACTCCGCTGACGTGGTCACCTATTCGAGCCTCACATGGTGAGGTACGAGGCCAAGG 

At 

2164 MST2, 2220 ECON1,

TrpLeuArgAspIleTrpAspTrpIleCysGluValLeuSerAspPheLysThrTrpLeu 
2222 TGGCTAAGGGACATCTGGGACTGGATATGCGAGGTGTTGAGCGACTTTAAGACCTGGCTA 
ACCGATTCCCTGTAGACCCTGACCTATACGCTCCACAACTCGCTGAAATTCTGGACCGAT 

LysAlaLysLeuMetProGlnLeuProGlyIleProPheValSerCysGlnArgGlyTyr
2282 AAAGCTAAGCTCATGCCACAGCTGCCTGGGATCCCCTTTGTGTCCTGCCAGCGCGGGTAT 
TTTCGATTCGAGTACGGTGTCGACGGACCCTAGGGGAAACACAGGACGGTCGCGCCCATA 

A A A 

2285 ESP1, 2300 PVU2, 2310 BAMHI,

LysGlyValTrpArgGlyAspGlyIleMetHisThrArgCysHisCysGlyAlaGluIle
2342 AAGGGGGTCTGGCGAGGGGACGGCATCATGCACACTCGCTGCCACTGTGGAGCTGAGATC
TTCCCCCAGACCGCTCCCCTGCCGTAGTACGTGTGAGCGACGGTGACACCTCGACTCTAG 

ThrGlyHisValLysAsnGlyThrMetArgIleValGlyProArgThrCysArgAsnMet
2402 ACTGGACATGTCAAAAACGGGACGATGAGGATCGTCGGTCCTAGGACCTGCAGGAACATG
TGACCTGTACAGTTTTTGCCCTGCTACTCCTAGCAGCCAGGATCCTGGACGTCCTTGTAC 

A ^ A A 

2425 BSAB1, 2441 AVR2, 2448 SSE8387I, 2449 PSTI,

TrpSerGlyThrPheProIleAsnAlaTyrThrThrGlyProCysThrProLeuProAla 
2462 TGGAGTGGGACCTTCCCCATTAATGCCTACACCACGGGCCCCTGTACCCCCCTTCCTGCG
ACCTCACCCTGGAAGGGGTAATTACGGATGTGGTGCCCGGGGACATGGGGGGAAGGACGC 

A A 

2480 ASE1, 2497 APAI,

ProAsnTyrThrPheAlaLeuTrpArgValSerAlaGluGluTyrValGluIleArgGln
2522 CCGAACTACACGTTCGCGCTATGGAGGGTGTCTGCAGAGGAATACGTGGAGATAAGGCAG 
GGCTTGATGTGCAAGCGCGATACCTCCCACAGACGTCTCCTTATGCACCTCTATTCCGTC 

2553 PSTI,

ValGlyAspPheHisTyrValThrGlyMetThrThrAspAsnLeuLysCysProCysGln
2582 GTGGGGGACTTCCACTACGTGACGGGTATGACTACTGACAATCTTAAATGCCCGTGCCAG 
CACCCCCTGAAGGTGATGCACTGCCCATACTGATGACTGTTAGAATTTACGGGCACGGTC 

2594 DRA3, 

ValProSerProGluPhePheThrGluLeuAspGlyValArgLeuHisArgPheAlaPro 
2642 GTCCCATCGCCCGAATTTTTCACAGAATTGGACGGGGTGCGCCTACATAGGTTTGCGCCC 
CAGGGTAGCGGGCTTAAAAAGTGTCTTAACCTGCCCCACGCGGATGTATCCAAACGCGGG 

ProCysLysProLeuLeuArgGluGluValSerPheArgValGlyLeuHisGluTyrPro
2702 CCCTGCAAGCCCTTGCTGCGGGAGGAGGTATCATTCAGAGTAGGACTCCACGAATACCCG 
GGGACGTTCGGGAACGACGCCCTCCTCCATAGTAAGTCTCATCCTGAGGTGCTTATGGGC 

2757 HGIE2, 

ValGlySerGlnLeuProCysGluProGluProAspValAlaValLeuThrSerMetLeu 
2762 GTAGGGTCGCAATTACCTTGCGAGCCCGAACCGGACGTGGCCGTGTTGACGTCCATGCTC 
CATCCCAGCGTTAATGGAACGCTCGGGCTTGGCCTGCACCGGCACAACTGCAGGTACGAG 

2809 AAT2, 

ThrAspProSerHisIleThrAlaGluAlaAlaGlyArgArgLeuAlaArgGlySerPro 
2822 ACTGATCCCTCCCATATAACAGCAGAGGCGGCCGGGCGAAGGTTGGCGAGGGGATCACCC 
TGACTAGGGAGGGTATATTGTCGTCTCCGCCGGCCCGCTTCCAACCGCTCCCCTAGTGGG 

2850 EAG1 XMA3,

ProSerValAlaSerSerSerAlaSerGlnLeuSerAlaProSerLeuLysAlaThrCys 
2882 CCCTCTGTGGCCAGCTCCTCGGCTAGCCAGCTATCCGCTCCATCTCTCAAGGCAACTTGC
GGGAGACACCGGTCGAGGAGCCGATCGGTCGATAGGCGAGGTAGAGAGTTCCGTTGAACG 

2889 BALI, 2903 NHEI,

ThrAlaAsnHisAspSerProAspAlaGluLeuIleGluAlaAsnLeuLeuTrpArgGln 
2942 ACCGCTAACCATGACTCCCCTGATGCTGAGCTCATAGAGGCCAACCTCCTATGGAGGCAG
TGGCGATTGGTACTGAGGGGACTACGACTCGAGTATCTCCGGTTGGAGGATACCTCCGTC 

A A 

2966 ESP1, 2969 SACI,

GluMetGlyGlyAsnIleThrArgValGluSerGluAsnLysValValIleLeuAspSer
3002 GAGATGGGCGGCAACATCACCAGGGTTGAGTCAGAAAACAAAGTGGTGATTCTGGACTCC 
CTCTACCCGCCGTTGTAGTGGTCCCAACTCAGTCTTTTGTTTCACCACTAAGACCTGAGG 

PheAspProLeuValAlaGluGluAspGluArgGluIleSerValProAlaGluIleLeu 
3062 TTCGATCCGCTTGTGGCGGAGGAGGACGAGCGGGAGATCTCCGTACCCGCAGAAATCCTG 
AAGCTAGGCGAACACCGCCTCCTCCTGCTCGCCCTCTAGAGGCATGGGCGTCTTTAGGAC 

A 

3096 BGL2,

ArgLysSerArgArgPheAlaGlnAlaLeuProValTrpAlaArgProAspTyrAsnPro
3122 CGGAAGTCTCGGAGATTCGCCCAGGCCCTGCCCGTTTGGGCGCGGCCGGACTATAACCCC
GCCTTCAGAGCCTCTAAGCGGGTCCGGGACGGGCAAACCCGCGCCGGCCTGATATTGGGG 

3143 ALWN1, 3164 EAG1 XMA3,

ProLeuValGluThrTrpLysLysProAspTyrGluProProValValHisGlyCysPro 
3182 CCGCTAGTGGAGACGTGGAAAAAGCCCGACTACGAACCACCTGTGGTCCATGGCTGCCCG 
GGCGATCACCTCTGCACCTTTTTCGGGCTGATGCTTGGTGGACACCAGGTACCGACGGGC 

3217 HGIE2, 3229 NCOI, 

LeuProProProLysSerProProValProProProArgLysLysArgThrValValLeu 
3242 CTTCCACCTCCAAAGTCCCCTCCTGTGCCTCCGCCTCGGAAGAAGCGGACGGTGGTCCTC
GAAGGTGGAGGTTTCAGGGGAGGACACGGAGGCGGAGCCTTCTTCGCCTGCCACCAGGAG 

ThrGluSerThrLeuSerThrAlaLeuAlaGluLeuAlaThrArgSerPheGlySerSer 
3302 ACTGAATCAACCCTATCTACTGCCTTGGCCGAGCTCGCCACCAGAAGCTTTGGCAGCTCC 
TGACTTAGTTGGGATAGATGACGGAACCGGCTCGAGCGGTGGTCTTCGAAACCGTCGAGG 

3332 SACI, 3346 HIND3, 

SerThrSerGlyIleThrGlyAspAsnThrThrThrSerSerGluProAlaProSerGly
3362 TCAACTTCCGGCATTACGGGCGACAATACGACAACATCCTCTGAGCCCGCCCCTTCTGGC 
AGTTGAAGGCCGTAATGCCCGCTGTTATGCTGTTGTAGGAGACTCGGGCGGGGAAGACCG 

CysProProAspSerAspAlaGluSerTyrSerSerMetProProLeuGluGlyGluPro 
3422 TGCCCCCCCGACTCCGACGCTGAGTCCTATTCCTCCATGCCCCCCCTGGAGGGGGAGCCT 
ACGGGGGGGCTGAGGCTGCGACTCAGGATAAGGAGGTACGGGGGGGACCTCCCCCTCGGA 

3437 EAM1105I,

GlyAspProAspLeuSerAspGlySerTrpSerThrValSerSerGluAlaAsnAlaGlu
3482 GGGGATCCGGATCTTAGCGACGGGTCATGGTCAACGGTCAGTAGTGAGGCCAACGCGGAG
CCCCTAGGCCTAGAATCGCTGCCCAGTACCAGTTGCCAGTCATCACTCCGGTTGCGCCTC

3484 BAMHI, 3485 BSAB1, 3487 BSPE1,

AspValValCysCysSerMetSerTyrSerTrpThrGlyAlaLeuValThrProCysAla 
3542 GATGTCGTGTGCTGCTCAATGTCTTACTCTTGGACAGGCGCACTCGTCACCCCGTGCGCC 
CTACAGCACACGACGAGTTACAGAATGAGAACCTGTCCGCGTGAGCAGTGGGGCACGCGG 

3589 DRA3, 3600 SAC2, 

AlaGluGluGlnLysLeuProIleAsnAlaLeuSerAsnSerLeuLeuArgHisHisAsn 
3602 GCGGAAGAACAGAAACTGCCCATCAATGCACTAAGCAACTCGTTGCTACGTCACCACAAT 
CGCCTTCTTGTCTTTGACGGGTAGTTACGTGATTCGTTGAGCAACGATGCAGTGGTGTTA 

3611 ALWN1, 3655 PFLM1,

LeuValTyrSerThrThrSerArgSerAlaCysGlnArgGlnLysLysValThrPheAsp 
3662 TTGGTGTATTCCACCACCTCACGCAGTGCTTGCCAAAGGCAGAAGAAAGTCACATTTGAC 
AACCACATAAGGTGGTGGAGTGCGTCACGAACGGTTTCCGTCTTCTTTCAGTGTAAACTG 

3681 DRA3, 

ArgLeuGlnValLeuAspSerHisTyrGlnAspValLeuLysGluValLysAlaAlaAla
3722 AGACTGCAAGTTCTGGACAGCCATTACCAGGACGTACTCAAGGAGGTTAAAGCAGCGGCG
TCTGACGTTCAAGACCTGTCGGTAATGGTCCTGCATGAGTTCCTCCAATTTCGTCGCCGC

SerLysValLysAlaAsnLeuLeuSerValGluGluAlaCysSerLeuThrProProHis 
3782 TCAAAAGTGAAGGCTAACTTGCTATCCGTAGAGGAAGCTTGCAGCCTGACGCCCCCACAC 
AGTTTTCACTTCCGATTGAACGATAGGCATCTCCTTCGAACGTCGGACTGCGGGGGTGTG

3816 HIND3, 

SerAlaLysSerLysPheGlyTyrGlyAlaLysAspValArgCysHisAlaArgLysAla 
3842 TCAGCCAAATCCAAGTTTGGTTATGGGGCAAAAGACGTCCGTTGCCATGCCAGAAAGGCC 
AGTCGGTTTAGGTTCAAACCAATACCCCGTTTTCTGCAGGCAACGGTACGGTCTTTCCGG 

A A 

3875 AAT2, 3890 BGLI, 

ValThrHisIleAsnSerValTrpLysAspLeuLeuGluAspAsnValThrProIleAsp
3902 GTAACCCACATCAACTCCGTGTGGAAAGACCTTCTGGAAGACAATGTAACACCAATAGAC 
CATTGGGTGTAGTTGAGGCACACCTTTCTGGAAGACCTTCTGTTACATTGTGGTTATCTG 

ThrThrIleMetAlaLysAsnGluValPheCysValGlnProGluLysGlyGlyArgLys
3962 ACTACCATCATGGCTAAGAACGAGGTTTTCTGCGTTCAGCCTGAGAAGGGGGGTCGTAAG 
TGATGGTAGTACCGATTCTTGCTCCAAAAGACGCAAGTCGGACTCTTCCCCCCAGCATTC 

ProAlaArgLeuIleValPheProAspLeuGlyValArgValCysGluLysMetAlaLeu
4022 CCAGCTCGTCTCATCGTGTTCCCCGATCTGGGCGTGCGCGTGTGCGAAAAGATGGCTTTG 
GGTCGAGCAGAGTAGCACAAGGGGCTAGACCCGCACGCGCACACGCTTTTCTACCGAAAC 

TyrAspValValThrLysLeuProLeuAlaValMetGlySerSerTyrGlyPheGlnTyr 
4082 TACGACGTGGTTACAAAGCTCCCCTTGGCCGTGATGGGAAGCTCCTACGGATTCCAATAC 
ATGCTGCACCAATGTTTCGAGGGGAACCGGCACTACCCTTCGAGGATGCCTAAGGTTATG 

SerProGlyGlnArgValGluPheLeuValGlnAlaTrpLysSerLysLysThrProMet 
4142 TCACCAGGACAGCGGGTTGAATTCCTCGTGCAAGCGTGGAAGTCCAAGAAAACCCCAATG
AGTGGTCCTGTCGCCCAACTTAAGGAGCACGTTCGCACCTTCAGGTTCTTTTGGGGTTAC 

4160 ECORI, 

GlyPheSerTyrAspThrArgCysPheAspSerThrValThrGluSerAspIleArgThr 
4202 GGGTTCTCGTATGATACCCGCTGCTTTGACTCCACAGTCACTGAGAGCGACATCCGTACG 
CCCAAGAGCATACTATGGGCGACGAAACTGAGGTGTCAGTGACTCTCGCTGTAGGCATGC 

A A 

4229 DRD1, 4236 ALWN1,

GluGluAlaIleTyrGlnCysCysAspLeuAspProGlnAlaArgValAlaIleLysSer
4262 GAGGAGGCAATCTACCAATGTTGTGACCTCGACCCCCAAGCCCGCGTGGCCATCAAGTCC 
CTCCTCCGTTAGATGGTTACAACACTGGAGCTGGGGGTTCGGGCGCACCGGTAGTTCAGG 

A A 

4301 BGLI, 4308 BALI, 

LeuThrGluArgLeuTyrValGlyGlyProLeuThrAsnSerArgGlyGluAsnCysGly 
4322 CTCACCGAGAGGCTTTATGTTGGGGGCCCTCTTACCAATTCAAGGGGGGAGAACTGCGGC
GAGTGGCTCTCCGAAATACAACCCCCGGGAGAATGGTTAAGTTCCCCCCTCTTGACGCCG 

A 

4345 APAI, 

TyrArgArgCysArgAlaSerGlyValLeuThrThrSerCysGlyAsnThrLeuThrCys 
4382 TATCGCAGGTGCCGCGCGAGCGGCGTACTGACAACTAGCTGTGGTAACACCCTCACTTGC
ATAGCGTCCACGGCGCGCTCGCCGCATGACTGTTGATCGACACCATTGTGGGAGTGAACG

TyrIleLysAlaArgAlaAlaCysArgAlaAlaGlyLeuGlnAspCysThrMetLeuVal
4442 TACATCAAGGCCCGGGCAGCCTGTCGAGCCGCAGGGCTCCAGGACTGCACCATGCTCGTG
ATGTAGTTCCGGGCCCGTCGGACAGCTCGGCGTCCCGAGGTCCTGACGTGGTACGAGCAC 

4452 SMAI XMAI,

CysGlyAspAspLeuValValIleCysGluSerAlaGlyValGlnGluAspAlaAlaSer
4502 TGTGGCGACGACTTAGTCGTTATCTGTGAAAGCGCGGGGGTCCAGGAGGACGCGGCGAGC
ACACCGCTGCTGAATCAGCAATAGACACTTTCGCGCCCCCAGGTCCTCCTGCGCCGCTCG 

4508 DRDl, 4511 TTH3I, 

LeuArgAlaPheThrGluAlaMetThrArgTyrSerAlaProProGlyAspProProGln 
4562 CTGAGAGCCTTCACGGAGGCTATGACCAGGTACTCCGCCCCCCCTGGGGACCCCCCACAA
GACTCTCGGAAGTGCCTCCGATACTGGTCCATGAGGCGGGGGGGACCCCTGGGGGGTGTT 

ProGluTyrAspLeuGluLeuIleThrSerCysSerSerAsnValSerValAlaHisAsp 
4622 CCAGAATACGACTTGGAGCTCATAACATCATGCTCCTCCAACGTGTCAGTCGCCCACGAC
GGTCTTATGCTGAACCTCGAGTATTGTAGTACGAGGAGGTTGCACAGTCAGCGGGTGCTG 

4637 SACI,

GlyAlaGlyLysArgValTyrTyrLeuThrArgAspProThrThrProLeuAlaArgAla 
4682 GGCGCTGGAAAGAGGGTCTACTACCTCACCCGTGACCCTACAACCCCCCTCGCGAGAGCT
CCGCGACCTTTCTCCCAGATGATGGAGTGGGCACTGGGATGTTGGGGGGAGCGCTCTCGA 

4731 NRUI, 

AlaTrpGluThrAlaArgHisThrProValAsnSerTrpLeuGlyAsnIleIleMetPhe
4742 GCGTGGGAGACAGCAAGACACACTCCAGTCAATTCCTGGCTAGGCAACATAATCATGTTT
CGCACCCTCTGTCGTTCTGTGTGAGGTCAGTTAAGGACCGATCCGTTGTATTAGTACAAA 

AlaProThrLeuTrpAlaArgMetIleLeuMetThrHisPhePheSerValLeuIleAla
4802 GCCCCCACACTGTGGGCGAGGATGATACTGATGACCCATTTCTTTAGCGTCCTTATAGCC 
CGGGGGTGTGACACCCGCTCCTACTATGACTACTGGGTAAAGAAATCGCAGGAATATCGG 

A A 

4806 PFLM1, 4807 DRA3,

ArgAspGlnLeuGluGlnAlaLeuAspCysGluIleTyrGlyAlaCysTyrSerIleGlu
4862 AGGGACCAGCTTGAACAGGCCCTCGATTGCGAGATCTACGGGGCCTGCTACTCCATAGAA
TCCCTGGTCGAACTTGTCCGGGAGCTAACGCTCTAGATGCCCCGGACGATGAGGTATCTT 

A 

4893 BGL2,

ProLeuAspLeuProProIleIleGlnArgLeuHisGlyLeuSerAlaPheSerLeuHis
4922 CCACTGGATCTACCTCCAATCATTCAAAGACTCCATGGCCTCAGCGCATTTTCACTCCAC 
GGTGACCTAGATGGAGGTTAGTAAGTTTCTGAGGTACCGGAGTCGCGTAAAAGTGAGGTG 

A 

4954 NCOI,

SerTyrSerProGlyGluIleAsnArgValAlaAlaCysLeuArgLysLeuGlyValPro 
4982 AGTTACTCTCCAGGTGAAATCAATAGGGTGGCCGCATGCCTCAGAAAACTTGGGGTACCG 
TCAATGAGAGGTCCACTTTAGTTATCCCACCGGCGTACGGAGTCTTTTGAACCCCATGGC 

^ A 

5015 SPHI, 5035 KPNI,

ProLeuArgAlaTrpArgHisArgAlaArgSerValArgAlaArgLeuLeuAlaArgGly
5042 CCCTTGCGAGCTTGGAGACACCGGGCCCGGAGCGTCCGCGCTAGGCTTCTGGCCAGAGGA 
GGGAACGCTCGAACCTCTGTGGCCCGGGCCTCGCAGGCGCGATCCGAAGACCGGTCTCCT 

5064 APAI, 5091 BALI,

GlyArgAlaAlaIleCysGlyLysTyrLeuPheAsnTrpAlaValArgThrLysLeuLys
5102 GGCAGGGCTGCCATATGTGGCAAGTACCTCTTCAACTGGGCAGTAAGAACAAAGCTCAAA 
CCGTCCCGACGGTATACACCGTTCATGGAGAAGTTGACCCGTCATTCTTGTTTCGAGTTT 

A 

5113 NDEI, 

LeuThrProIleAlaAlaAlaGlyGlnLeuAspLeuSerGlyTrpPheThrAlaGlyTyr 
5162 CTCACTCCAATAGCGGCCGCTGGCCAGCTGGACTTGTCCGGCTGGTTCACGGCTGGCTAC 
GAGTGAGGTTATCGCCGGCGACCGGTCGACCTGAACAGGCCGACCAAGTGCCGACCGATG 

A A A A 

5174 NOTI, 5175 EAG1 XMA3, 5182 BALI, 5186 PVU2,

SerGlyGlyAspIleTyrHisSerValSerHisAlaArgProArgTrpIleTrpPheCys 
5222 AGCGGGGGAGACATTTATCACAGCGTGTCTCATGCCCGGCCCCGCTGGATCTGGTTTTGC 
TCGCCCCCTCTGTAAATAGTGTCGCACAGAGTACGGGCCGGGGCGACCTAGACCAAAACG 

A 

5240 DRA3, 

LeuLeuLeuLeuAlaAlaGlyValGlyIleTyrLeuLeuProAsnArgMetSerThrAsn
5282 CTACTCCTGCTTGCTGCAGGGGTAGGCATCTACCTCCTCCCCAACCGAATGAGCACGAAT 
GATGAGGACGAACGACGTCCCCATCCGTAGATGGAGGAGGGGTTGGCTTACTCGTGCTTA 

A 

5295 PSTI, 

ProLysProGlnArgLysThrLysArgAsnThrAsnArgArgProGlnAspValLysPhe 
5342 CCTAAACCTCAAAGAAAGACCAAACGTAACACCAACCGGCGGCCGCAGGACGTCAAGTTC
GGATTTGGAGTTTCTTTCTGGTTTGCATTGTGGTTGGCCGCCGGCGTCCTGCAGTTCAAG 

A A A A 

5380 NOTI, 5381 EAG1 XMA3, 5390 AAT2, 5401 SMAI XMAI,

ProGlyGlyGlyGlnIleValGlyGlyValTyrLeuLeuProArgArgGlyProArgLeu
5402 CCGGGTGGCGGTCAGATCGTTGGTGGAGTTTACTTGTTGCCGCGCAGGGGCCCTAGATTG 
GGCCCACCGCCAGTCTAGCAACCACCTCAAATGAACAACGGCGCGTCCCCGGGATCTAAC 

A 

5449 APAI, 

GlyValArgAlaThrArgLysThrSerGluArgSerGlnProArgGlyArgArgGlnPro 
5462 GGTGTGCGCGCGACGAGAAAGACTTCCGAGCGGTCGCAACCTCGAGGTAGACGTCAGCCT
CCACACGCGCGCTGCTCTTTCTGAAGGCTCGCCAGCGTTGGAGCTCCATCTGCAGTCGGA 

A A A A 

5467 BSSH2, 5478 XMNI, 5502 XHOI, 5511 AAT2, 

IleProLysAlaArgArgProGluGlyArgThrTrpAlaGlnProGlyTyrProTrpPro 
5522 ATCCCCAAGGCTCGTCGGCCCGAGGGCAGGACCTGGGCTCAGCCCGGGTACCCTTGGCCC 
TAGGGGTTCCGAGCAGCCGGGCTCCCGTCCTGGACCCGAGTCGGGCCCATGGGAACCGGG 

A AAA 

5548 ALWN1, 5558 ESP1, 5564 SMAI XMAI, 5568 KPNI,

LeuTyrGlyAsnGluGlyCysGlyTrpAlaGlyTrpLeuLeuSerProArgGlySerArg 
5582 CTCTATGGCAATGAGGGCTGCGGGTGGGCGGGATGGCTCCTGTCTCCCCGTGGCTCTCGG 
GAGATACCGTTACTCCCGACGCCCACCCGCCCTACCGAGGACAGAGGGGCACCGAGAGCC

ProSerTrpGlyProThrAspProArgArgArgSerArgAsnLeuGlyLysValIleAsp
5642 CCTAGCTGGGGCCCCACAGACCCCCGGCGTAGGTCGCGCAATTTGGGTAAGGTCATCGAT 
GGATCGACCCCGGGGTGTCTGGGGGCCGCATCCAGCGCGTTAAACCCATTCCAGTAGCTA 

5650 APAI, 5696 CLAI,

ThrLeuThrCysGlyPheAlaAspLeuMetGlyTyrIleProLeuValGlyAlaProLeu
5702 ACCCTTACGTGCGGCTTCGCCGACCTCATGGGGTACATACCGCTCGTCGGCGCCCCTCTT 
TGGGAATGCACGCCGAAGCGGCTGGAGTACCCCATGTATGGCGAGCAGCCGCGGGGAGAA 

5724 HGIE2, 5750 KAS1 NARI, 5756 ECON1,

GlyGlyAlaAlaArgAlaLeuAlaHisGlyValArgValLeuGluAspGlyValAsnTyr 
5762 GGAGGCGCTGCCAGGGCCCTGGCGCATGGCGTCCGGGTTCTGGAAGACGGCGTGAACTAT 
CCTCCGCGACGGTCCCGGGACCGCGTACCGCAGGCCCAAGACCTTCTGCCGCACTTGATA 

5772 BSTXI, 5775 APAI,

AlaThrGlyAsnLeuProGlyCysSerOC AM 
5822 GCAACAGGGAACCTTCCTGGTTGCTCTTAATAGTCGAC
CGTTGTCCCTTGGAAGGACCAACGAGAATTATCAGCTG 

5854 SALI,

MetAlaAlaTyrAlaAlaGlnGlyTyrLysValLeuValLeuAsn
2 AGCTTACAAAACAAAATGGCTGCATATGCAGCTCAGGGCTATAAGGTGCTAGTACTCAAC
TCGAATGTTTTGTTTTACCGACGTATACGTCGAGTCCCGATATTCCACGATCATGAGTTG 

1 HIND3, 24 NDEI, 52 SCAI, 

ProSerValAlaAlaThrLeuGlyPheGlyAlaTyrMetSerLysAlaHisGlyIleAsp
62 CCCTCTGTTGCTGCAACACTGGGCTTTGGTGCTTACATGTCCAAGGCTCATGGGATCGAT
GGGAGACAACGACGTTGTGACCCGAAACCACGAATGTACAGGTTCCGAGTACCCTAGCTA

116 CLAI, 

ProAsnIleArgThrGlyValArgThrIleThrThrGlySerProIleThrTyrSerThr
122 CCTAACATCAGGACCGGGGTGAGAACAATTACCACTGGCAGCCCCATCACGTACTCCACC
GGATTGTAGTCCTGGCCCCACTCTTGTTAATGGTGACCGTCGGGGTAGTGCATGAGGTGG 

TyrGlyLysPheLeuAlaAspGlyGlyCysSerGlyGlyAlaTyrAspIleIleIleCys
182 TACGGCAAGTTCCTTGCCGACGGCGGGTGCTCGGGGGGCGCTTATGACATAATAATTTGT
ATGCCGTTCAAGGAACGGCTGCCGCCCACGAGCCCCCCGCGAATACTGTATTATTAAACA

AspGluCysHisSerThrAspAlaThrSerIleLeuGlyIleGlyThrValLeuAspGln
242 GACGAGTGCCACTCCACGGATGCCACATCCATCTTGGGCATTGGCACTGTCCTTGACCAA
CTGCTCACGGTGAGGTGCCTACGGTGTAGGTAGAACCCGTAACCGTGACAGGAACTGGTT

AlaGluThrAlaGlyAlaArgLeuValValLeuAlaThrAlaThrProProGlySerVal
302 GCAGAGACTGCGGGGGCGAGACTGGTTGTGCTCGCCACCGCCACCCCTCCGGGCTCCGTC
CGTCTCTGACGCCCCCGCTCTGACCAACACGAGCGGTGGCGGTGGGGAGGCCCGAGGCAG

303 ALWN1,

ThrValProHisProAsnIleGluGluValAlaLeuSerThrThrGlyGluIleProPhe
362 ACTGTGCCCCATCCCAACATCGAGGAGGTTGCTCTGTCCACCACCGGAGAGATCCCTTTT
TGACACGGGGTAGGGTTGTAGCTCCTCCAACGAGACAGGTGGTGGCCTCTCTAGGGAAAA

TyrGlyLysAlaIleProLeuGluValIleLysGlyGlyArgHisLeuIlePheCysHis
422 TACGGCAAGGCTATCCCCCTCGAAGTAATCAAGGGGGGGAGACATCTCATCTTCTGTCAT
ATGCCGTTCCGATAGGGGGAGCTTCATTAGTTCCCCCCCTCTGTAGAGTAGAAGACAGTA

SerLysLysLysCysAspGluLeuAlaAlaLysLeuValAlaLeuGlyIleAsnAlaVal
482 TCAAAGAAGAAGTGCGACGAACTCGCCGCAAAGCTGGTCGCATTGGGCATCAATGCCGTG
AGTTTCTTCTTCACGCTGCTTGAGCGGCGTTTCGACCAGCGTAACCCGTAGTTACGGCAC

AlaTyrTyrArgGlyLeuAspValSerValIleProThrSerGlyAspValValValVal
542 GCCTACTACCGCGGTCTTGACGTGTCCGTCATCCCGACCAGCGGCGATGTTGTCGTCGTG 
CGGATGATGGCGCCAGAACTGCACAGGCAGTAGGGCTGGTCGCCGCTACAACAGCAGCAC

550 SAC2, 560 DRD1,

AlaThrAspAlaLeuMetThrGlyTyrThrGlyAspPheAspSerValIleAspCysAsn
602 GCAACCGATGCCCTCATGACCGGCTATACCGGCGACTTCGACTCGGTGATAGACTGCAAT
CGTTGGCTACGGGAGTACTGGCCGATATGGCCGCTGAAGCTGAGCCACTATCTGACGTTA

615 BSPH1,

ThrCysValThrGlnThrValAspPheSerLeuAspProThrPheThrIleGluThrIle
662 ACGTGTGTCACCCAGACAGTCGATTTCAGCCTTGACCCTACCTTCACCATTGAGACAATC
TGCACACAGTGGGTCTGTCAGCTAAAGTCGGAACTGGGATGGAAGTGGTAACTCTGTTAG 

ThrLeuProGlnAspAlaValSerArgThrGlnArgArgGlyArgThrGlyArgGlyLys 
722 ACGCTCCCCCAAGATGCTGTCTCCCGCACTCAACGTCGGGGCAGGACTGGCAGGGGGAAG 
TGCGAGGGGGTTCTACGACAGAGGGCGTGAGTTGCAGCCCCGTCCTGACCGTCCCCCTTC 

ProGlyIleTyrArgPheValAlaProGlyGluArgProSerGlyMetPheAspSerSer
782 CCAGGCATCTACAGATTTGTGGCACCGGGGGAGCGCCCCTCCGGCATGTTCGACTCGTCC 
GGTCCGTAGATGTCTAAACACCGTGGCCCCCTCGCGGGGAGGCCGTACAAGCTGAGCAGG 

816 BGLI, 833 DRD1,

ValLeuCysGluCysTyrAspAlaGlyCysAlaTrpTyrGluLeuThrProAlaGluThr
842 GTCCTCTGTGAGTGCTATGACGCAGGCTGTGCTTGGTATGAGCTCACGCCCGCCGAGACT
CAGGAGACACTCACGATACTGCGTCCGACACGAACCATACTCGAGTGCGGGCGGCTCTGA

A 

881 SACI, 

ThrValArgLeuArgAlaTyrMetAsnThrProGlyLeuProValCysGlnAspHisLeu 
902 ACAGTTAGGCTACGAGCGTACATGAACACCCCGGGGCTTCCCGTGTGCCAGGACCATCTT 
TGTCAATCCGATGCTCGCATGTACTTGTGGGGCCCCGAAGGGCACACGGTCCTGGTAGAA 

A 

931 SMAI XMAI, 

GluPheTrpGluGlyValPheThrGlyLeuThrHisIleAspAlaHisPheLeuSerGln 
962 GAATTTTGGGAGGGCGTCTTTACAGGCCTCACTCATATAGATGCCCACTTTCTATCCCAG 
CTTAAAACCCTCCCGCAGAAATGTCCGGAGTGAGTATATCTACGGGTGAAAGATAGGGTC 

A 

985 STUI,

ThrLysGlnSerGlyGluAsnLeuProTyrLeuValAlaTyrGlnAlaThrValCysAla 
1022 ACAAAGCAGAGTGGGGAGAACCTTCCTTACCTGGTAGCGTACCAAGCCACCGTGTGCGCT 
TGTTTCGTCTCACCCCTCTTGGAAGGAATGGACCATCGCATGGTTCGGTGGCACACGCGA

A 

1069 DRA3, 

ArgAlaGlnAlaProProProSerTrpAspGlnMetTrpLysCysLeuIleArgLeuLys 
1082 AGGGCTCAAGCCCCTCCCCCATCGTGGGACCAGATGTGGAAGTGTTTGATTCGCCTCAAG
TCCCGAGTTCGGGGAGGGGGTAGCACCCTGGTCTACACCTTCACAAACTAAGCGGAGTTC

ProThrLeuHisGlyProThrProLeuLeuTyrArgLeuGlyAlaValGlnAsnGluIle
1142 CCCACCCTCCATGGGCCAACACCCCTGCTATACAGACTGGGCGCTGTTCAGAATGAAATC
GGGTGGGAGGTACCCGGTTGTGGGGACGATATGTCTGACCCGCGACAAGTCTTACTTTAG

A 

1150 NCOI, 

ThrLeuThrHisProValThrLysTyrIleMetThrCysMetSerAlaAspLeuGluVal
1202 ACCCTGACGCACCCAGTCACCAAATACATCATGACATGCATGTCGGCCGACCTGGAGGTC
TGGGACTGCGTGGGTCAGTGGTTTATGTAGTACTGTACGTACAGCCGGCTGGACCTCCAG

AAA A A 

1230 BSPH1, 1234 DRD1, 1237 AVA3, 1245 EAG1 XMA3, 1250 DRD1,



ValThrSerThrTrpValLeuValGlyGlyValLeuAlaAlaLeuAlaAlaTyrCysLeu 
1262 GTCACGAGCACCTGGGTGCTCGTTGGCGGCGTCCTGGCTGCTTTGGCCGCGTATTGCCTG 
CAGTGCTCGTGGACCCACGAGCAACCGCCGCAGGACCGACGAAACCGGCGCATAACGGAC 

SerThrGlyCysValValIleValGlyArgValValLeuSerGlyLysProAlaIleIle
1322 TCAACAGGCTGCGTGGTCATAGTGGGCAGGGTCGTCTTGTCCGGGAAGCCGGCAATCATA 
AGTTGTCCGACGCACCAGTATCACCCGTCCCAGCAGAACAGGCCCTTCGGCCGTTAGTAT 

A 

1369 NAEI, 

ProAspArgGluValLeuTyrArgGluPheAspGluMetGluGluCysSerGlnHisLeu 
1382 CCTGACAGGGAAGTCCTCTACCGAGAGTTCGATGAGATGGAAGAGTGCTCTCAGCACTTA 
GGACTGTCCCTTCAGGAGATGGCTCTCAAGCTACTCTACCTTCTCACGAGAGTCGTGAAT 

A 

1385 DRD1,

ProTyrIleGluGlnGlyMetMetLeuAlaGluGlnPheLysGlnLysAlaLeuGlyLeu
1442 CCGTACATCGAGCAAGGGATGATGCTCGCCGAGCAGTTCAAGCAGAAGGCCCTCGGCCTC
GGCATGTAGCTCGTTCCCTACTACGAGCGGCTCGTCAAGTTCGTCTTCCGGGAGCCGGAG 

LeuGlnThrAlaSerArgGlnAlaGluValIleAlaProAlaValGlnThrAsnTrpGln
1502 CTGCAGACCGCGTCCCGTCAGGCAGAGGTTATCGCCCCTGCTGTCCAGACCAACTGGCAA 
GACGTCTGGCGCAGGGCAGTCCGTCTCCAATAGCGGGGACGACAGGTCTGGTTGACCGTT 

A /\ 

1502 PSTI, 1507 TTH3I, 

LysLeuGluThrPheTrpAlaLysHisMetTrpAsnPheIleSerGlyIleGlnTyrLeu
1562 AAACTCGAGACCTTCTGGGCGAAGCATATGTGGAACTTCATCAGTGGGATACAATACTTG
TTTGAGCTCTGGAAGACCCGCTTCGTATACACCTTGAAGTAGTCACCCTATGTTATGAAC 

1565 XHOI, 1586 NDEI, 

AlaGlyLeuSerThrLeuProGlyAsnProAlaIleAlaSerLeuMetAlaPheThrAla
1622 GCGGGCTTGTCAACGCTGCCTGGTAACCCCGCCATTGCTTCATTGATGGCTTTTACAGCT 
CGCCCGAACAGTTGCGACGGACCATTGGGGCGGTAACGAAGTAACTACCGAAAATGTCGA 

A A 

1643 BSTE2, 1677 ALWN1 PVU2,

AlaValThrSerProLeuThrThrSerGlnThrLeuLeuPheAsnIleLeuGlyGlyTrp
1682 GCTGTCACCAGCCCACTAACCACTAGCCAAACCCTCCTCTTCAACATATTGGGGGGGTGG 
CGACAGTGGTCGGGTGATTGGTGATCGGTTTGGGAGGAGAAGTTGTATAACCCCCCCACC

ValAlaAlaGlnLeuAlaAlaProGlyAlaAlaThrAlaPheValGlyAlaGlyLeuAla
1742 GTGGCTGCCCAGCTCGCCGCCCCCGGTGCCGCTACTGCCTTTGTGGGCGCTGGCTTAGCT 
CACCGACGGGTCGAGCGGCGGGGGCCACGGCGATGACGGAAACACCCGCGACCGAATCGA 

1794 ESP1,

GlyAlaAlaIleGlySerValGlyLeuGlyLysValLeuIleAspIleLeuAlaGlyTyr
1802 GGCGCCGCCATCGGCAGTGTTGGACTGGGGAAGGTCCTCATAGACATCCTTGCAGGGTAT 
CCGCGGCGGTAGCCGTCACAACCTGACCCCTTCCAGGAGTATCTGTAGGAACGTCCCATA 

1802 KAS1 NARI,

GlyAlaGlyValAlaGlyAlaLeuValAlaPheLysIleMetSerGlyGluValProSer 
1862 GGCGCGGGCGTGGCGGGAGCTCTTGTGGCATTCAAGATCATGAGCGGTGAGGTCCCCTCC 
CCGCGCCCGCACCGCCCTCGAGAACACCGTAAGTTCTAGTACTCGCCACTCCAGGGGAGG 

1878 SACI, 1899 BSPH1,

ThrGluAspLeuValAsnLeuLeuProAlaIleLeuSerProGlyAlaLeuValValGly
1922 ACGGAGGACCTGGTCAATCTACTGCCCGCCATCCTCTCGCCCGGAGCCCTCGTAGTCGGC 
TGCCTCCTGGACCAGTTAGATGACGGGCGGTAGGAGAGCGGGCCTCGGGAGCATCAGCCG 

1928 TTH3I, 

ValValCysAlaAlaIleLeuArgArgHisValGlyProGlyGluGlyAlaValGlnTrp
1982 GTGGTCTGTGCAGCAATACTGCGCCGGCACGTTGGCCCGGGCGAGGGGGCAGTGCAGTGG 
CACCAGACACGTCGTTATGACGCGGCCGTGCAACCGGGCCCGCTCCCCCGTCACGTCACC 

A A 

2004 NAEI, 2017 SMAI XMAI, 

MetAsnArgLeuIleAlaPheAlaSerArgGlyAsnHisValSerProThrHisTyrVal
2042 ATGAACCGGCTGATAGCCTTCGCCTCCCGGGGGAACCATGTTTCCCCCACGCACTACGTG
TACTTGGCCGACTATCGGAAGCGGAGGGCCCCCTTGGTACAAAGGGGGTGCGTGATGCAC 

A ^ 

2067 SMAI XMAI, 2093 DRA3, 

ProGluSerAspAlaAlaAlaArgValThrAlaIleLeuSerSerLeuThrValThrGln
2102 CCGGAGAGCGATGCAGCTGCCCGCGTCACTGCCATACTCAGCAGCCTCACTGTAACCCAG 
GGCCTCTCGCTACGTCGACGGGCGCAGTGACGGTATGAGTCGTCGGAGTGACATTGGGTC 

A A 

2115 PVU2, 2159 ALWN1,

LeuLeuArgArgLeuHisGlnTrpIleSerSerGluCysThrThrProCysSerGlySer 
2162 CTCCTGAGGCGACTGCACCAGTGGATAAGCTCGGAGTGTACCACTCCATGCTCCGGTTCC 
GAGGACTCCGCTGACGTGGTCACCTATTCGAGCCTCACATGGTGAGGTACGAGGCCAAGG 

A A 

2164 MST2, 2220 ECON1,

TrpLeuArgAspIleTrpAspTrpIleCysGluValLeuSerAspPheLysThrTrpLeu 
2222 TGGCTAAGGGACATCTGGGACTGGATATGCGAGGTGTTGAGCGACTTTAAGACCTGGCTA 
ACCGATTCCCTGTAGACCCTGACCTATACGCTCCACAACTCGCTGAAATTCTGGACCGAT 

LysAlaLysLeuMetProGlnLeuProGlyIleProPheValSerCysGlnArgGlyTyr
2282 AAAGCTAAGCTCATGCCACAGCTGCCTGGGATCCCCTTTGTGTCCTGCCAGCGCGGGTAT 
TTTCGATTCGAGTACGGTGTCGACGGACCCTAGGGGAAACACAGGACGGTCGCGCCCATA 

A A A 

2285 ESP1, 2300 PVU2, 2310 BAMHI,

LysGlyValTrpArgGlyAspGlyIleMetHisThrArgCysHisCysGlyAlaGluIle
2342 AAGGGGGTCTGGCGAGGGGACGGCATCATGCACACTCGCTGCCACTGTGGAGCTGAGATC 
TTCCCCCAGACCGCTCCCCTGCCGTAGTACGTGTGAGCGACGGTGACACCTCGACTCTAG 

ThrGlyHisValLysAsnGlyThrMetArgIleValGlyProArgThrCysArgAsnMet
2402 ACTGGACATGTCAAAAACGGGACGATGAGGATCGTCGGTCCTAGGACCTGCAGGAACATG
TGACCTGTACAGTTTTTGCCCTGCTACTCCTAGCAGCCAGGATCCTGGACGTCCTTGTAC

/S AAA 

2425 BSAB1, 2441 AVR2, 2448 SSE8387I, 2449 PSTI,

TrpSerGlyThrPheProIleAsnAlaTyrThrThrGlyProCysThrProLeuProAla 
2462 TGGAGTGGGACCTTCCCCATTAATGCCTACACCACGGGCCCCTGTACCCCCCTTCCTGCG
ACCTCACCCTGGAAGGGGTAATTACGGATGTGGTGCCCGGGGACATGGGGGGAAGGACGC 

A ^ 

2480 ASE1, 2497 APAI,

ProAsnTyrThrPheAlaLeuTrpArgValSerAlaGluGluTyrValGluIleArgGln
2522 CCGAACTACACGTTCGCGCTATGGAGGGTGTCTGCAGAGGAATACGTGGAGATAAGGCAG
GGCTTGATGTGCAAGCGCGATACCTCCCACAGACGTCTCCTTATGCACCTCTATTCCGTC 

A 

2553 PSTI, 

ValGlyAspPheHisTyrValThrGlyMetThrThrAspAsnLeuLysCysProCysGln 
2582 GTGGGGGACTTCCACTACGTGACGGGTATGACTACTGACAATCTTAAATGCCCGTGCCAG 
CACCCCCTGAAGGTGATGCACTGCCCATACTGATGACTGTTAGAATTTACGGGCACGGTC 

A 

2594 DRA3, 

ValProSerProGluPhePheThrGluLeuAspGlyValArgLeuHisArgPheAlaPro 
2642 GTCCCATCGCCCGAATTTTTCACAGAATTGGACGGGGTGCGCCTACATAGGTTTGCGCCC
CAGGGTAGCGGGCTTAAAAAGTGTCTTAACCTGCCCCACGCGGATGTATCCAAACGCGGG 

ProCysLysProLeuLeuArgGluGluValSerPheArgValGlyLeuHisGluTyrPro 
2702 CCCTGCAAGCCCTTGCTGCGGGAGGAGGTATCATTCAGAGTAGGACTCCACGAATACCCG 
GGGACGTTCGGGAACGACGCCCTCCTCCATAGTAAGTCTCATCCTGAGGTGCTTATGGGC 

A 

2757 HGIE2, 

ValGlySerGlnLeuProCysGluProGluProAspValAlaValLeuThrSerMetLeu 
2762 GTAGGGTCGCAATTACCTTGCGAGCCCGAACCGGACGTGGCCGTGTTGACGTCCATGCTC 
CATCCCAGCGTTAATGGAACGCTCGGGCTTGGCCTGCACCGGCACAACTGCAGGTACGAG 

A 

2809 AAT2, 

ThrAspProSerHisIleThrAlaGluAlaAlaGlyArgArgLeuAlaArgGlySerPro 
2822 ACTGATCCCTCCCATATAACAGCAGAGGCGGCCGGGCGAAGGTTGGCGAGGGGATCACCC 
TGACTAGGGAGGGTATATTGTCGTCTCCGCCGGCCCGCTTCCAACCGCTCCCCTAGTGGG 

A 

2850 EAG1 XMA3,

ProSerValAlaSerSerSerAlaSerGlnLeuSerAlaProSerLeuLysAlaThrCys 
2882 CCCTCTGTGGCCAGCTCCTCGGCTAGCCAGCTATCCGCTCCATCTCTCAAGGCAACTTGC 
GGGAGACACCGGTCGAGGAGCCGATCGGTCGATAGGCGAGGTAGAGAGTTCCGTTGAACG 

A A 

2889 BALI, 2903 NHEI,

ThrAlaAsnHisAspSerProAspAlaGluLeuIleGluAlaAsnLeuLeuTrpArgGln
2942 ACCGCTAACCATGACTCCCCTGATGCTGAGCTCATAGAGGCCAACCTCCTATGGAGGCAG
TGGCGATTGGTACTGAGGGGACTACGACTCGAGTATCTCCGGTTGGAGGATACCTCCGTC

2966 ESP1, 2969 SACI,

GluMetGlyGlyAsnIleThrArgValGluSerGluAsnLysValValIleLeuAspSer
3002 GAGATGGGCGGCAACATCACCAGGGTTGAGTCAGAAAACAAAGTGGTGATTCTGGACTCC 
CTCTACCCGCCGTTGTAGTGGTCCCAACTCAGTCTTTTGTTTCACCACTAAGACCTGAGG 

PheAspProLeuValAlaGluGluAspGluArgGluIleSerValProAlaGluIleLeu 
3062 TTCGATCCGCTTGTGGCGGAGGAGGACGAGCGGGAGATCTCCGTACCCGCAGAAATCCTG 
AAGCTAGGCGAACACCGCCTCCTCCTGCTCGCCCTCTAGAGGCATGGGCGTCTTTAGGAC 

3096 BGL2, 

ArgLysSerArgArgPheAlaGlnAlaLeuProValTrpAlaArgProAspTyrAsnPro 
3122 CGGAAGTCTCGGAGATTCGCCCAGGCCCTGCCCGTTTGGGCGCGGCCGGACTATAACCCC 
GCCTTCAGAGCCTCTAAGCGGGTCCGGGACGGGCAAACCCGCGCCGGCCTGATATTGGGG 

3143 ALWN1, 3164 EAG1 XMA3,

ProLeuValGluThrTrpLysLysProAspTyrGluProProValValHisGlyCysPro 
3182 CCGCTAGTGGAGACGTGGAAAAAGCCCGACTACGAACCACCTGTGGTCCATGGCTGCCCG 
GGCGATCACCTCTGCACCTTTTTCGGGCTGATGCTTGGTGGACACCAGGTACCGACGGGC 

3217 HGIE2, 3229 NCOI,

LeuProProProLysSerProProValProProProArgLysLysArgThrValValLeu 
3242 CTTCCACCTCCAAAGTCCCCTCCTGTGCCTCCGCCTCGGAAGAAGCGGACGGTGGTCCTC 
GAAGGTGGAGGTTTCAGGGGAGGACACGGAGGCGGAGCCTTCTTCGCCTGCCACCAGGAG 

ThrGluSerThrLeuSerThrAlaLeuAlaGluLeuAlaThrArgSerPheGlySerSer
3302 ACTGAATCAACCCTATCTACTGCCTTGGCCGAGCTCGCCACCAGAAGCTTTGGCAGCTCC
TGACTTAGTTGGGATAGATGACGGAACCGGCTCGAGCGGTGGTCTTCGAAACCGTCGAGG 

A A 

3332 SACI, 3346 HIND3, 

SerThrSerGlyIleThrGlyAspAsnThrThrThrSerSerGluProAlaProSerGly
3362 TCAACTTCCGGCATTACGGGCGACAATACGACAACATCCTCTGAGCCCGCCCCTTCTGGC 
AGTTGAAGGCCGTAATGCCCGCTGTTATGCTGTTGTAGGAGACTCGGGCGGGGAAGACCG 

CysProProAspSerAspAlaGluSerTyrSerSerMetProProLeuGluGlyGluPro 
3422 TGCCCCCCCGACTCCGACGCTGAGTCCTATTCCTCCATGCCCCCCCTGGAGGGGGAGCCT 
ACGGGGGGGCTGAGGCTGCGACTCAGGATAAGGAGGTACGGGGGGGACCTCCCCCTCGGA 

A 

3437 EAM1105I,

GlyAspProAspLeuSerAspGlySerTrpSerThrValSerSerGluAlaAsnAlaGlu 
3482 GGGGATCCGGATCTTAGCGACGGGTCATGGTCAACGGTCAGTAGTGAGGCCAACGCGGAG 
CCCCTAGGCCTAGAATCGCTGCCCAGTACCAGTTGCCAGTCATCACTCCGGTTGCGCCTC 

AAA 

3484 BAMHI, 3485 BSAB1, 3487 BSPE1,

AspValValCysCysSerMetSerTyrSerTrpThrGlyAlaLeuValThrProCysAla 
3542 GATGTCGTGTGCTGCTCAATGTCTTACTCTTGGACAGGCGCACTCGTCACCCCGTGCGCC 
CTACAGCACACGACGAGTTACAGAATGAGAACCTGTCCGCGTGAGCAGTGGGGCACGCGG

3589 DRA3, 3600 SAC2,

AlaGluGluGlnLysLeuProIleAsnAlaLeuSerAsnSerLeuLeuArgHisHisAsn 
3602 GCGGAAGAACAGAAACTGCCCATCAATGCACTAAGCAACTCGTTGCTACGTCACCACAAT
CGCCTTCTTGTCTTTGACGGGTAGTTACGTGATTCGTTGAGCAACGATGCAGTGGTGTTA 

3611 ALWN1, 3655 PFLM1,

LeuValTyrSerThrThrSerArgSerAlaCysGlnArgGlnLysLysValThrPheAsp 
3662 TTGGTGTATTCCACCACCTCACGCAGTGCTTGCCAAAGGCAGAAGAAAGTCACATTTGAC
AACCACATAAGGTGGTGGAGTGCGTCACGAACGGTTTCCGTCTTCTTTCAGTGTAAACTG 

3681 DRA3, 

ArgLeuGlnValLeuAspSerHisTyrGlnAspValLeuLysGluValLysAlaAlaAla 
3722 AGACTGCAAGTTCTGGACAGCCATTACCAGGACGTACTCAAGGAGGTTAAAGCAGCGGCG 
TCTGACGTTCAAGACCTGTCGGTAATGGTCCTGCATGAGTTCCTCCAATTTCGTCGCCGC 

SerLysValLysAlaAsnLeuLeuSerValGluGluAlaCysSerLeuThrProProHis 
3782 TCAAAAGTGAAGGCTAACTTGCTATCCGTAGAGGAAGCTTGCAGCCTGACGCCCCCACAC 
AGTTTTCACTTCCGATTGAACGATAGGCATCTCCTTCGAACGTCGGACTGCGGGGGTGTG 

3816 HIND3, 

SerAlaLysSerLysPheGlyTyrGlyAlaLysAspValArgCysHisAlaArgLysAla 
3842 TCAGCCAAATCCAAGTTTGGTTATGGGGCAAAAGACGTCCGTTGCCATGCCAGAAAGGCC
AGTCGGTTTAGGTTCAAACCAATACCCCGTTTTCTGCAGGCAACGGTACGGTCTTTCCGG 

3875 AAT2, 3890 BGLI, 

ValThrHisIleAsnSerValTrpLysAspLeuLeuGluAspAsnValThrProIleAsp 
3902 GTAACCCACATCAACTCCGTGTGGAAAGACCTTCTGGAAGACAATGTAACACCAATAGAC 
CATTGGGTGTAGTTGAGGCACACCTTTCTGGAAGACCTTCTGTTACATTGTGGTTATCTG 

ThrThrIleMetAlaLysAsnGluValPheCysValGlnProGluLysGlyGlyArgLys
3962 ACTACCATCATGGCTAAGAACGAGGTTTTCTGCGTTCAGCCTGAGAAGGGGGGTCGTAAG 
TGATGGTAGTACCGATTCTTGCTCCAAAAGACGCAAGTCGGACTCTTCCCCCCAGCATTC 

ProAlaArgLeuIleValPheProAspLeuGlyValArgValCysGluLysMetAlaLeu 
4022 CCAGCTCGTCTCATCGTGTTCCCCGATCTGGGCGTGCGCGTGTGCGAAAAGATGGCTTTG 
GGTCGAGCAGAGTAGCACAAGGGGCTAGACCCGCACGCGCACACGCTTTTCTACCGAAAC 

TyrAspValValThrLysLeuProLeuAlaValMetGlySerSerTyrGlyPheGlnTyr 
4082 TACGACGTGGTTACAAAGCTCCCCTTGGCCGTGATGGGAAGCTCCTACGGATTCCAATAC 
ATGCTGCACCAATGTTTCGAGGGGAACCGGCACTACCCTTCGAGGATGCCTAAGGTTATG 

SerProGlyGlnArgValGluPheLeuValGlnAlaTrpLysSerLysLysThrProMet
4142 TCACCAGGACAGCGGGTTGAATTCCTCGTGCAAGCGTGGAAGTCCAAGAAAACCCCAATG 
AGTGGTCCTGTCGCCCAACTTAAGGAGCACGTTCGCACCTTCAGGTTCTTTTGGGGTTAC 

A 

4160 ECORI, 

GlyPheSerTyrAspThrArgCysPheAspSerThrValThrGluSerAspIleArgThr 
4202 GGGTTCTCGTATGATACCCGCTGCTTTGACTCCACAGTCACTGAGAGCGACATCCGTACG 
CCCAAGAGCATACTATGGGCGACGAAACTGAGGTGTCAGTGACTCTCGCTGTAGGCATGC
4229 DRDI, 4236 ALWNI,

GluGluAlaIleTyrGlnCysCysAspLeuAspProGlnAlaArgValAlaIleLysSer
4262 GAGGAGGCAATCTACCAATGTTGTGACCTCGACCCCCAAGCCCGCGTGGCCATCAAGTCC
CTCCTCCGTTAGATGGTTACAACACTGGAGCTGGGGGTTCGGGCGCACCGGTAGTTCAGG 

4301 BGLI, 4308 BALI, 

LeuThrGluArgLeuTyrValGlyGlyProLeuThrAsnSerArgGlyGluAsnCysGly
4322 CTCACCGAGAGGCTTTATGTTGGGGGCCCTCTTACCAATTCAAGGGGGGAGAACTGCGGC
GAGTGGCTCTCCGAAATACAACCCCCGGGAGAATGGTTAAGTTCCCCCCTCTTGACGCCG

4345 APAI,

TyrArgArgCysArgAlaSerGlyValLeuThrThrSerCysGlyAsnThrLeuThrCys 
4382 TATCGCAGGTGCCGCGCGAGCGGCGTACTGACAACTAGCTGTGGTAACACCCTCACTTGC
ATAGCGTCCACGGCGCGCTCGCCGCATGACTGTTGATCGACACCATTGTGGGAGTGAACG 

TyrIleLysAlaArgAlaAlaCysArgAlaAlaGlyLeuGlnAspCysThrMetLeuVal
4442 TACATCAAGGCCCGGGCAGCCTGTCGAGCCGCAGGGCTCCAGGACTGCACCATGCTCGTG
ATGTAGTTCCGGGCCCGTCGGACAGCTCGGCGTCCCGAGGTCCTGACGTGGTACGAGCAC 

4452 SMAI XMAI, 

CysGlyAspAspLeuValValIleCysGluSerAlaGlyValGlnGluAspAlaAlaSer
4502 TGTGGCGACGACTTAGTCGTTATCTGTGAAAGCGCGGGGGTCCAGGAGGACGCGGCGAGC
ACACCGCTGCTGAATCAGCAATAGACACTTTCGCGCCCCCAGGTCCTCCTGCGCCGCTCG

4508 DRDI, 4511 TTH3I,

LeuArgAlaPheThrGluAlaMetThrArgTyrSerAlaProProGlyAspProProGln 
4562 CTGAGAGCCTTCACGGAGGCTATGACCAGGTACTCCGCCCCCCCTGGGGACCCCCCACAA
GACTCTCGGAAGTGCCTCCGATACTGGTCCATGAGGCGGGGGGGACCCCTGGGGGGTGTT 

ProGluTyrAspLeuGluLeuIleThrSerCysSerSerAsnValSerValAlaHisAsp
4622 CCAGAATACGACTTGGAGCTCATAACATCATGCTCCTCCAACGTGTCAGTCGCCCACGAC
GGTCTTATGCTGAACCTCGAGTATTGTAGTACGAGGAGGTTGCACAGTCAGCGGGTGCTG

4637 SACI,

GlyAlaGlyLysArgValTyrTyrLeuThrArgAspProThrThrProLeuAlaArgAla 
4682 GGCGCTGGAAAGAGGGTCTACTACCTCACCCGTGACCCTACAACCCCCCTCGCGAGAGCT
CCGCGACCTTTCTCCCAGATGATGGAGTGGGCACTGGGATGTTGGGGGGAGCGCTCTCGA 

4731 NRUI, 

AlaTrpGluThrAlaArgHisThrProValAsnSerTrpLeuGlyAsnIleIleMetPhe
4742 GCGTGGGAGACAGCAAGACACACTCCAGTCAATTCCTGGCTAGGCAACATAATCATGTTT 
CGCACCCTCTGTCGTTCTGTGTGAGGTCAGTTAAGGACCGATCCGTTGTATTAGTACAAA 

AlaProThrLeuTrpAlaArgMetIleLeuMetThrHisPhePheSerValLeuIleAla
4802 GCCCCCACACTGTGGGCGAGGATGATACTGATGACCCATTTCTTTAGCGTCCTTATAGCC
CGGGGGTGTGACACCCGCTCCTACTATGACTACTGGGTAAAGAAATCGCAGGAATATCGG

A A

4806 PFLMI, 4807 DRA3,

ArgAspGlnLeuGluGlnAlaLeuAspCysGluIleTyrGlyAlaCysTyrSerIleGlu
4862 AGGGACCAGCTTGAACAGGCCCTCGATTGCGAGATCTACGGGGCCTGCTACTCCATAGAA
TCCCTGGTCGAACTTGTCCGGGAGCTAACGCTCTAGATGCCCCGGACGATGAGGTATCTT 

4893 BGL2, 

ProLeuAspLeuProProIleIleGlnArgLeuHisGlyLeuSerAlaPheSerLeuHis
4922 CCACTGGATCTACCTCCAATCATTCAAAGACTCCATGGCCTCAGCGCATTTTCACTCCAC
GGTGACCTAGATGGAGGTTAGTAAGTTTCTGAGGTACCGGAGTCGCGTAAAAGTGAGGTG

4954 NCOI,

SerTyrSerProGlyGluIleAsnArgValAlaAlaCysLeuArgLysLeuGlyValPro 
4982 AGTTACTCTCCAGGTGAAATCAATAGGGTGGCCGCATGCCTCAGAAAACTTGGGGTACCG
TCAATGAGAGGTCCACTTTAGTTATCCCACCGGCGTACGGAGTCTTTTGAACCCCATGGC 

5015 SPHI, 5035 KPNI, 

ProLeuArgAlaTrpArgHisArgAlaArgSerValArgAlaArgLeuLeuAlaArgGly 
5042 CCCTTGCGAGCTTGGAGACACCGGGCCCGGAGCGTCCGCGCTAGGCTTCTGGCCAGAGGA
GGGAACGCTCGAACCTCTGTGGCCCGGGCCTCGCAGGCGCGATCCGAAGACCGGTCTCCT 

5064 APAI, 5091 BALI, 

GlyArgAlaAlaIleCysGlyLysTyrLeuPheAsnTrpAlaValArgThrLysLeuLys
5102 GGCAGGGCTGCCATATGTGGCAAGTACCTCTTCAACTGGGCAGTAAGAACAAAGCTCAAA
CCGTCCCGACGGTATACACCGTTCATGGAGAAGTTGACCCGTCATTCTTGTTTCGAGTTT

5113 NDEI,

LeuThrProIleAlaAlaAlaGlyGlnLeuAspLeuSerGlyTrpPheThrAlaGlyTyr 
5162 CTCACTCCAATAGCGGCCGCTGGCCAGCTGGACTTGTCCGGCTGGTTCACGGCTGGCTAC 
GAGTGAGGTTATCGCCGGCGACCGGTCGACCTGAACAGGCCGACCAAGTGCCGACCGATG 

A A A A 

5174 NOTI, 5175 EAGI XMA3, 5182 BALI, 5186 PVU2,

SerGlyGlyAspIleTyrHisSerValSerHisAlaArgProArgTrpIleTrpPheCys 
5222 AGCGGGGGAGACATTTATCACAGCGTGTCTCATGCCCGGCCCCGCTGGATCTGGTTTTGC 
TCGCCCCCTCTGTAAATAGTGTCGCACAGAGTACGGGCCGGGGCGACCTAGACCAAAACG 

A 

5240 DRA3, 

LeuLeuLeuLeuAlaAlaGlyValGlyIleTyrLeuLeuProAsnArgMetSerThrAsn
5282 CTACTCCTGCTTGCTGCAGGGGTAGGCATCTACCTCCTCCCCAACCGAATGAGCACGAAT
GATGAGGACGAACGACGTCCCCATCCGTAGATGGAGGAGGGGTTGGCTTACTCGTGCTTA 

A 

5295 PSTI, 

ProLysProGlnArgLysThrLysArgAsnThrAsnArgArgProGlnAspValLysPhe 
5342 CCTAAACCTCAAAGAAAGACCAAACGTAACACCAACCGGCGGCCGCAGGACGTCAAGTTC
GGATTTGGAGTTTCTTTCTGGTTTGCATTGTGGTTGGCCGCCGGCGTCCTGCAGTTCAAG 

A A A A 

5380 NOTI, 5381 EAGI XMA3, 5390 AAT2, 5401 SMAI XMAI,

ProGlyGlyGlyGlnIleValGlyGlyValTyrLeuLeuProArgArgGlyProArgLeu
5402 CCGGGTGGCGGTCAGATCGTTGGTGGAGTTTACTTGTTGCCGCGCAGGGGCCCTAGATTG 
GGCCCACCGCCAGTCTAGCAACCACCTCAAATGAACAACGGCGCGTCCCCGGGATCTAAC 
5449 APAI,

GlyValArgAlaThrArgLysThrSerGluArgSerGlnProArgGlyArgArgGlnPro
5462 GGTGTGCGCGCGACGAGAAAGACTTCCGAGCGGTCGCAACCTCGAGGTAGACGTCAGCCT
CCACACGCGCGCTGCTCTTTCTGAAGGCTCGCCAGCGTTGGAGCTCCATCTGCAGTCGGA 

5467 BSSH2, 5478 XMNI, 5502 XHOI, 5511 AAT2, 

IleProLysAlaArgArgProGluGlyArgThrTrpAlaGlnProGlyTyrProTrpPro
5522 ATCCCCAAGGCTCGTCGGCCCGAGGGCAGGACCTGGGCTCAGCCCGGGTACCCTTGGCCC 
TAGGGGTTCCGAGCAGCCGGGCTCCCGTCCTGGACCCGAGTCGGGCCCATGGGAACCGGG 

A AAA 

5548 ALWNI, 5558 ESPI, 5564 SMAI XMAI, 5568 KPNI,

LeuTyrGlyAsnGluGlyCysGlyTrpAlaGlyTrpLeuLeuSerProArgGlySerArg 
5582 CTCTATGGCAATGAGGGCTGCGGGTGGGCGGGATGGCTCCTGTCTCCCCGTGGCTCTCGG
GAGATACCGTTACTCCCGACGCCCACCCGCCCTACCGAGGACAGAGGGGCACCGAGAGCC 

ProSerTrpGlyProThrAspProArgArgArgSerArgAsnLeuGlyLysValIleAsp
5642 CCTAGCTGGGGCCCCACAGACCCCCGGCGTAGGTCGCGCAATTTGGGTAAGGTCATCGAT 
GGATCGACCCCGGGGTGTCTGGGGGCCGCATCCAGCGCGTTAAACCCATTCCAGTAGCTA 

r ^ 

5650 APAI, 5696 CLAI, 

ThrLeuThrCysGlyPheAlaAspLeuMetGlyTyrIleProLeuValOC AM
5702 ACCCTTACGTGCGGCTTCGCCGACCTCATGGGGTACATACCGCTCGTCTAATAGTCGAC
TGGGAATGCACGCCGAAGCGGCTGGAGTACCCCATGTATGGCGAGCAGATTATCAGCTG 

^ A 

5724 HGIE2, 5755 SALI, 
MetAlaAlaTyrAlaAlaGlnGlyTyrLysValLeuValLeuAsn
2 AGCTTACAAAACAAAATGGCTGCATATGCAGCTCAGGGCTATAAGGTGCTAGTACTCAAC 
TCGAATGTTTTGTTTTACCGACGTATACGTCGAGTCCCGATATTCCACGATCATGAGTTG 

1 HIND3, 24 NDEI, 52 SCAI,

ProSerValAlaAlaThrLeuGlyPheGlyAlaTyrMetSerLysAlaHisGlyIleAsp
62 CCCTCTGTTGCTGCAACACTGGGCTTTGGTGCTTACATGTCCAAGGCTCATGGGATCGAT
GGGAGACAACGACGTTGTGACCCGAAACCACGAATGTACAGGTTCCGAGTACCCTAGCTA 

116 CLAI, 

ProAsnIleArgThrGlyValArgThrIleThrThrGlySerProIleThrTyrSerThr
122 CCTAACATCAGGACCGGGGTGAGAACAATTACCACTGGCAGCCCCATCACGTACTCCACC
GGATTGTAGTCCTGGCCCCACTCTTGTTAATGGTGACCGTCGGGGTAGTGCATGAGGTGG 

TyrGlyLysPheLeuAlaAspGlyGlyCysSerGlyGlyAlaTyrAspIleIleIleCys
182 TACGGCAAGTTCCTTGCCGACGGCGGGTGCTCGGGGGGCGCTTATGACATAATAATTTGT
ATGCCGTTCAAGGAACGGCTGCCGCCCACGAGCCCCCCGCGAATACTGTATTATTAAACA

AspGluCysHisSerThrAspAlaThrSerIleLeuGlyIleGlyThrValLeuAspGln
242 GACGAGTGCCACTCCACGGATGCCACATCCATCTTGGGCATTGGCACTGTCCTTGACCAA 
CTGCTCACGGTGAGGTGCCTACGGTGTAGGTAGAACCCGTAACCGTGACAGGAACTGGTT 

AlaGluThrAlaGlyAlaArgLeuValValLeuAlaThrAlaThrProProGlySerVal 
302 GCAGAGACTGCGGGGGCGAGACTGGTTGTGCTCGCCACCGCCACCCCTCCGGGCTCCGTC
CGTCTCTGACGCCCCCGCTCTGACCAACACGAGCGGTGGCGGTGGGGAGGCCCGAGGCAG

303 ALWNI,

ThrValProHisProAsnIleGluGluValAlaLeuSerThrThrGlyGluIleProPhe
362 ACTGTGCCCCATCCCAACATCGAGGAGGTTGCTCTGTCCACCACCGGAGAGATCCCTTTT 
TGACACGGGGTAGGGTTGTAGCTCCTCCAACGAGACAGGTGGTGGCCTCTCTAGGGAAAA 

TyrGlyLysAlaIleProLeuGluValIleLysGlyGlyArgHisLeuIlePheCysHis
422 TACGGCAAGGCTATCCCCCTCGAAGTAATCAAGGGGGGGAGACATCTCATCTTCTGTCAT 
ATGCCGTTCCGATAGGGGGAGCTTCATTAGTTCCCCCCCTCTGTAGAGTAGAAGACAGTA 
SerLysLysLysCysAspGluLeuAlaAlaLysLeuValAlaLeuGlyIleAsnAlaVal
482 TCAAAGAAGAAGTGCGACGAACTCGCCGCAAAGCTGGTCGCATTGGGCATCAATGCCGTG 
AGTTTCTTCTTCACGCTGCTTGAGCGGCGTTTCGACCAGCGTAACCCGTAGTTACGGCAC 

AlaTyrTyrArgGlyLeuAspValSerValIleProThrSerGlyAspValValValVal
542 GCCTACTACCGCGGTCTTGACGTGTCCGTCATCCCGACCAGCGGCGATGTTGTCGTCGTG
CGGATGATGGCGCCAGAACTGCACAGGCAGTAGGGCTGGTCGCCGCTACAACAGCAGCAC 

A A 

550 SAC2, 560 DRDI,

AlaThrAspAlaLeuMetThrGlyTyrThrGlyAspPheAspSerValIleAspCysAsn
602 GCAACCGATGCCCTCATGACCGGCTATACCGGCGACTTCGACTCGGTGATAGACTGCAAT
CGTTGGCTACGGGAGTACTGGCCGATATGGCCGCTGAAGCTGAGCCACTATCTGACGTTA

615 BSPHI,

ThrCysValThrGlnThrValAspPheSerLeuAspProThrPheThrIleGluThrIle
662 ACGTGTGTCACCCAGACAGTCGATTTCAGCCTTGACCCTACCTTCACCATTGAGACAATC
TGCACACAGTGGGTCTGTCAGCTAAAGTCGGAACTGGGATGGAAGTGGTAACTCTGTTAG

ThrLeuProGlnAspAlaValSerArgThrGlnArgArgGlyArgThrGlyArgGlyLys 
722 ACGCTCCCCCAAGATGCTGTCTCCCGCACTCAACGTCGGGGCAGGACTGGCAGGGGGAAG 
TGCGAGGGGGTTCTACGACAGAGGGCGTGAGTTGCAGCCCCGTCCTGACCGTCCCCCTTC 

ProGlyIleTyrArgPheValAlaProGlyGluArgProSerGlyMetPheAspSerSer
782 CCAGGCATCTACAGATTTGTGGCACCGGGGGAGCGCCCCTCCGGCATGTTCGACTCGTCC
GGTCCGTAGATGTCTAAACACCGTGGCCCCCTCGCGGGGAGGCCGTACAAGCTGAGCAGG 

A ^ 

816 BGLI, 833 DRDI,

ValLeuCysGluCysTyrAspAlaGlyCysAlaTrpTyrGluLeuThrProAlaGluThr 
842 GTCCTCTGTGAGTGCTATGACGCAGGCTGTGCTTGGTATGAGCTCACGCCCGCCGAGACT
CAGGAGACACTCACGATACTGCGTCCGACACGAACCATACTCGAGTGCGGGCGGCTCTGA 

A 

881 SACI, 

ThrValArgLeuArgAlaTyrMetAsnThrProGlyLeuProValCysGlnAspHisLeu 
902 ACAGTTAGGCTACGAGCGTACATGAACACCCCGGGGCTTCCCGTGTGCCAGGACCATCTT 
TGTCAATCCGATGCTCGCATGTACTTGTGGGGCCCCGAAGGGCACACGGTCCTGGTAGAA 

A 

931 SMAI XMAI, 

GluPheTrpGluGlyValPheThrGlyLeuThrHisIleAspAlaHisPheLeuSerGln
962 GAATTTTGGGAGGGCGTCTTTACAGGCCTCACTCATATAGATGCCCACTTTCTATCCCAG 
CTTAAAACCCTCCCGCAGAAATGTCCGGAGTGAGTATATCTACGGGTGAAAGATAGGGTC 

A 

985 STUI, 

ThrLysGlnSerGlyGluAsnLeuProTyrLeuValAlaTyrGlnAlaThrValCysAla 
1022 ACAAAGCAGAGTGGGGAGAACCTTCCTTACCTGGTAGCGTACCAAGCCACCGTGTGCGCT 
TGTTTCGTCTCACCCCTCTTGGAAGGAATGGACCATCGCATGGTTCGGTGGCACACGCGA 

A 

1069 DRA3,

ArgAlaGlnAlaProProProSerTrpAspGlnMetTrpLysCysLeuIleArgLeuLys 
1082 AGGGCTCAAGCCCCTCCCCCATCGTGGGACCAGATGTGGAAGTGTTTGATTCGCCTCAAG 
TCCCGAGTTCGGGGAGGGGGTAGCACCCTGGTCTACACCTTCACAAACTAAGCGGAGTTC

ProThrLeuHisGlyProThrProLeuLeuTyrArgLeuGlyAlaValGlnAsnGluIle
1142 CCCACCCTCCATGGGCCAACACCCCTGCTATACAGACTGGGCGCTGTTCAGAATGAAATC
GGGTGGGAGGTACCCGGTTGTGGGGACGATATGTCTGACCCGCGACAAGTCTTACTTTAG

A

1150 NCOI,

ThrLeuThrHisProValThrLysTyrIleMetThrCysMetSerAlaAspLeuGluVal
1202 ACCCTGACGCACCCAGTCACCAAATACATCATGACATGCATGTCGGCCGACCTGGAGGTC
TGGGACTGCGTGGGTCAGTGGTTTATGTAGTACTGTACGTACAGCCGGCTGGACCTCCAG

1230 BSPHI, 1234 DRDI, 1237 AVA3, 1245 EAGI XMA3, 1250 DRDI,



ValThrSerThrTrpValLeuValGlyGlyValLeuAlaAlaLeuAlaAlaTyrCysLeu 
1262 GTCACGAGCACCTGGGTGCTCGTTGGCGGCGTCCTGGCTGCTTTGGCCGCGTATTGCCTG 
CAGTGCTCGTGGACCCACGAGCAACCGCCGCAGGACCGACGAAACCGGCGCATAACGGAC 

SerThrGlyCysValValIleValGlyArgValValLeuSerGlyLysProAlaIleIle
1322 TCAACAGGCTGCGTGGTCATAGTGGGCAGGGTCGTCTTGTCCGGGAAGCCGGCAATCATA 
AGTTGTCCGACGCACCAGTATCACCCGTCCCAGCAGAACAGGCCCTTCGGCCGTTAGTAT 

1369 NAEI, 

ProAspArgGluValLeuTyrArgGluPheAspGluMetGluGluCysSerGlnHisLeu
1382 CCTGACAGGGAAGTCCTCTACCGAGAGTTCGATGAGATGGAAGAGTGCTCTCAGCACTTA
GGACTGTCCCTTCAGGAGATGGCTCTCAAGCTACTCTACCTTCTCACGAGAGTCGTGAAT 

A 

1385 DRDI,

ProTyrIleGluGlnGlyMetMetLeuAlaGluGlnPheLysGlnLysAlaLeuGlyLeu
1442 CCGTACATCGAGCAAGGGATGATGCTCGCCGAGCAGTTCAAGCAGAAGGCCCTCGGCCTC
GGCATGTAGCTCGTTCCCTACTACGAGCGGCTCGTCAAGTTCGTCTTCCGGGAGCCGGAG 

LeuGlnThrAlaSerArgGlnAlaGluValIleAlaProAlaValGlnThrAsnTrpGln
1502 CTGCAGACCGCGTCCCGTCAGGCAGAGGTTATCGCCCCTGCTGTCCAGACCAACTGGCAA 
GACGTCTGGCGCAGGGCAGTCCGTCTCCAATAGCGGGGACGACAGGTCTGGTTGACCGTT 

A A 

1502 PSTI, 1507 TTH3I, 

LysLeuGluThrPheTrpAlaLysHisMetTrpAsnPheIleSerGlyIleGlnTyrLeu
1562 AAACTCGAGACCTTCTGGGCGAAGCATATGTGGAACTTCATCAGTGGGATACAATACTTG
TTTGAGCTCTGGAAGACCCGCTTCGTATACACCTTGAAGTAGTCACCCTATGTTATGAAC 

A A 

1565 XHOI, 1586 NDEI,

AlaGlyLeuSerThrLeuProGlyAsnProAlaIleAlaSerLeuMetAlaPheThrAla
1622 GCGGGCTTGTCAACGCTGCCTGGTAACCCCGCCATTGCTTCATTGATGGCTTTTACAGCT 
CGCCCGAACAGTTGCGACGGACCATTGGGGCGGTAACGAAGTAACTACCGAAAATGTCGA 

1643 BSTE2, 1677 ALWNI PVU2,

AlaValThrSerProLeuThrThrSerGlnThrLeuLeuPheAsnIleLeuGlyGlyTrp
1682 GCTGTCACCAGCCCACTAACCACTAGCCAAACCCTCCTCTTCAACATATTGGGGGGGTGG 
CGACAGTGGTCGGGTGATTGGTGATCGGTTTGGGAGGAGAAGTTGTATAACCCCCCCACC 
ValAlaAlaGlnLeuAlaAlaProGlyAlaAlaThrAlaPheValGlyAlaGlyLeuAla
1742 GTGGCTGCCCAGCTCGCCGCCCCCGGTGCCGCTACTGCCTTTGTGGGCGCTGGCTTAGCT
CACCGACGGGTCGAGCGGCGGGGGCCACGGCGATGACGGAAACACCCGCGACCGAATCGA

1794 ESPI,

GlyAlaAlaIleGlySerValGlyLeuGlyLysValLeuIleAspIleLeuAlaGlyTyr
1802 GGCGCCGCCATCGGCAGTGTTGGACTGGGGAAGGTCCTCATAGACATCCTTGCAGGGTAT 
CCGCGGCGGTAGCCGTCACAACCTGACCCCTTCCAGGAGTATCTGTAGGAACGTCCCATA 

1802 KASI NARI,

GlyAlaGlyValAlaGlyAlaLeuValAlaPheLysIleMetSerGlyGluValProSer
1862 GGGGCGGGCGTGGCGGGAGCTCTTGTGGCATTCAAGATCATGAGCGGTGAGGTCCCCTCC 
CCGCGCCCGCACCGCCCTCGAGAACACCGTAAGTTCTAGTACTCGCCACTCCAGGGGAGG 

1878 SACI, 1899 BSPHI,

ThrGluAspLeuValAsnLeuLeuProAlaIleLeuSerProGlyAlaLeuValValGly
1922 ACGGAGGACCTGGTCAATCTACTGCCCGCCATCCTCTCGCCCGGAGCCCTCGTAGTCGGC 
TGCCTCCTGGACCAGTTAGATGACGGGCGGTAGGAGAGCGGGCCTCGGGAGCATCAGCCG 

1928 TTH3I,

ValValCysAlaAlaIleLeuArgArgHisValGlyProGlyGluGlyAlaValGlnTrp
1982 GTGGTCTGTGCAGCAATACTGCGCCGGCACGTTGGCCCGGGCGAGGGGGCAGTGCAGTGG 
CACCAGACACGTCGTTATGACGCGGCCGTGCAACCGGGCCCGCTCCCCCGTCACGTCACC 

2004 NAEI, 2017 SMAI XMAI, 

MetAsnArgLeuIleAlaPheAlaSerArgGlyAsnHisValSerProThrHisTyrVal 
2042 ATGAACCGGCTGATAGCCTTCGCCTCCCGGGGGAACCATGTTTCCCCCACGCACTACGTG 
TACTTGGCCGACTATCGGAAGCGGAGGGCCCCCTTGGTACAAAGGGGGTGCGTGATGCAC 

2067 SMAI XMAI, 2093 DRA3, 

ProGluSerAspAlaAlaAlaArgValThrAlaIleLeuSerSerLeuThrValThrGln
2102 CCGGAGAGCGATGCAGCTGCCCGCGTCACTGCCATACTCAGCAGCCTCACTGTAACCCAG 
GGCCTCTCGCTACGTCGACGGGCGCAGTGACGGTATGAGTCGTCGGAGTGACATTGGGTC 

2115 PVU2, 2159 ALWNI,

LeuLeuArgArgLeuHisGlnTrpIleSerSerGluCysThrThrProCysSerGlySer
2162 CTCCTGAGGCGACTGCACCAGTGGATAAGCTCGGAGTGTACCACTCCATGCTCCGGTTCC
GAGGACTCCGCTGACGTGGTCACCTATTCGAGCCTCACATGGTGAGGTACGAGGCCAAGG

A

2164 MST2, 2220 ECONI,

TrpLeuArgAspIleTrpAspTrpIleCysGluValLeuSerAspPheLysThrTrpLeu 
2222 TGGCTAAGGGACATCTGGGACTGGATATGCGAGGTGTTGAGCGACTTTAAGACCTGGCTA
ACCGATTCCCTGTAGACCCTGACCTATACGCTCCACAACTCGCTGAAATTCTGGACCGAT

LysAlaLysLeuMetProGlnLeuProGlyIleProPheValSerCysGlnArgGlyTyr
2282 AAAGCTAAGCTCATGCCACAGCTGCCTGGGATCCCCTTTGTGTCCTGCCAGCGCGGGTAT 
TTTCGATTCGAGTACGGTGTCGACGGACCCTAGGGGAAACACAGGACGGTCGCGCCCATA 

2285 ESPI, 2300 PVU2, 2310 BAMHI,
LysGlyValTrpArgGlyAspGlyIleMetHisThrArgCysHisCysGlyAlaGluIle
2342 AAGGGGGTCTGGCGAGGGGACGGCATCATGCACACTCGCTGCCACTGTGGAGCTGAGATC 
TTCCCCCAGACCGCTCCCCTGCCGTAGTACGTGTGAGCGACGGTGACACCTCGACTCTAG 

ThrGlyHisValLysAsnGlyThrMetArgIleValGlyProArgThrCysArgAsnMet
2402 ACTGGACATGTCAAAAACGGGACGATGAGGATCGTCGGTCCTAGGACCTGCAGGAACATG 
TGACCTGTACAGTTTTTGCCCTGCTACTCCTAGCAGCCAGGATCCTGGACGTCCTTGTAC 

A AAA 

2425 BSABI, 2441 AVR2, 2448 SSE8387I, 2449 PSTI,

TrpSerGlyThrPheProIleAsnAlaTyrThrThrGlyProCysThrProLeuProAla 
2462 TGGAGTGGGACCTTCCCCATTAATGCCTACACCACGGGCCCCTGTACCCCCCTTCCTGCG
ACCTCACCCTGGAAGGGGTAATTACGGATGTGGTGCCCGGGGACATGGGGGGAAGGACGC 

A A 

2480 ASEI, 2497 APAI,

ProAsnTyrThrPheAlaLeuTrpArgValSerAlaGluGluTyrValGluIleArgGln 
2522 CCGAACTACACGTTCGCGCTATGGAGGGTGTCTGCAGAGGAATACGTGGAGATAAGGCAG 
GGCTTGATGTGCAAGCGCGATACCTCCCACAGACGTCTCCTTATGCACCTCTATTCCGTC 

A 

2553 PSTI, 

ValGlyAspPheHisTyrValThrGlyMetThrThrAspAsnLeuLysCysProCysGln 
2582 GTGGGGGACTTCCACTACGTGACGGGTATGACTACTGACAATCTTAAATGCCCGTGCCAG 
CACCCCCTGAAGGTGATGCACTGCCCATACTGATGACTGTTAGAATTTACGGGCACGGTC 

A 

2594 DRA3, 

ValProSerProGluPhePheThrGluLeuAspGlyValArgLeuHisArgPheAlaPro 
2642 GTCCCATCGCCCGAATTTTTCACAGAATTGGACGGGGTGCGCCTACATAGGTTTGCGCCC 
CAGGGTAGCGGGCTTAAAAAGTGTCTTAACCTGCCCCACGCGGATGTATCCAAACGCGGG 

ProCysLysProLeuLeuArgGluGluValSerPheArgValGlyLeuHisGluTyrPro 
2702 CCCTGCAAGCCCTTGCTGCGGGAGGAGGTATCATTCAGAGTAGGACTCCACGAATACCCG 
GGGACGTTCGGGAACGACGCCCTCCTCCATAGTAAGTCTCATCCTGAGGTGCTTATGGGC 

A 

2757 HGIE2, 

ValGlySerGlnLeuProCysGluProGluProAspValAlaValLeuThrSerMetLeu 
2762 GTAGGGTCGCAATTACCTTGCGAGCCCGAACCGGACGTGGCCGTGTTGACGTCCATGCTC 
CATCCCAGCGTTAATGGAACGCTCGGGCTTGGCCTGCACCGGCACAACTGCAGGTACGAG 

A 

2809 AAT2, 

ThrAspProSerHisIleThrAlaGluAlaAlaGlyArgArgLeuAlaArgGlySerPro
2822 ACTGATCCCTCCCATATAACAGCAGAGGCGGCCGGGCGAAGGTTGGCGAGGGGATCACCC 
TGACTAGGGAGGGTATATTGTCGTCTCCGCCGGCCCGCTTCCAACCGCTCCCCTAGTGGG 

A 

2850 EAGI XMA3,

ProSerValAlaSerSerSerAlaSerGlnLeuSerAlaProSerLeuLysAlaThrCys 
2882 CCCTCTGTGGCCAGCTCCTCGGCTAGCCAGCTATCCGCTCCATCTCTCAAGGCAACTTGC 
GGGAGACACCGGTCGAGGAGCCGATCGGTCGATAGGCGAGGTAGAGAGTTCCGTTGAACG 

^ A 

2889 BALI, 2903 NHEI, 
ThrAlaAsnHisAspSerProAspAlaGluLeuIleGluAlaAsnLeuLeuTrpArgGln
2942 ACCGCTAACCATGACTCCCCTGATGCTGAGCTCATAGAGGCCAACCTCCTATGGAGGCAG 
TGGCGATTGGTACTGAGGGGACTACGACTCGAGTATCTCCGGTTGGAGGATACCTCCGTC 

A A 

2966 ESPI, 2969 SACI,

GluMetGlyGlyAsnIleThrArgValGluSerGluAsnLysValValIleLeuAspSer
3002 GAGATGGGCGGCAACATCACCAGGGTTGAGTCAGAAAACAAAGTGGTGATTCTGGACTCC 
CTCTACCCGCCGTTGTAGTGGTCCCAACTCAGTCTTTTGTTTCACCACTAAGACCTGAGG 

PheAspProLeuValAlaGluGluAspGluArgGluIleSerValProAlaGluIleLeu
3062 TTCGATCCGCTTGTGGCGGAGGAGGACGAGCGGGAGATCTCCGTACCCGCAGAAATCCTG 
AAGCTAGGCGAACACCGCCTCCTCCTGCTCGCCCTCTAGAGGCATGGGCGTCTTTAGGAC 

A 

3096 BGL2, 

ArgLysSerArgArgPheAlaGlnAlaLeuProValTrpAlaArgProAspTyrAsnPro 
3122 CGGAAGTCTCGGAGATTCGCCCAGGCCCTGCCCGTTTGGGCGCGGCCGGACTATAACCCC
GCCTTCAGAGCCTCTAAGCGGGTCCGGGACGGGCAAACCCGCGCCGGCCTGATATTGGGG 

A ^ 

3143 ALWNI, 3164 EAGI XMA3,

ProLeuValGluThrTrpLysLysProAspTyrGluProProValValHisGlyCysPro 
3182 CCGCTAGTGGAGACGTGGAAAAAGCCCGACTACGAACCACCTGTGGTCCATGGCTGCCCG 
GGCGATCACCTCTGCACCTTTTTCGGGCTGATGCTTGGTGGACACCAGGTACCGACGGGC 

A ^ 

3217 HGIE2, 3229 NCOI, 

LeuProProProLysSerProProValProProProArgLysLysArgThrValValLeu 
3242 CTTCCACCTCCAAAGTCCCCTCCTGTGCCTCCGCCTCGGAAGAAGCGGACGGTGGTCCTC 
GAAGGTGGAGGTTTCAGGGGAGGACACGGAGGCGGAGCCTTCTTCGCCTGCCACCAGGAG 

ThrGluSerThrLeuSerThrAlaLeuAlaGluLeuAlaThrArgSerPheGlySerSer
3302 ACTGAATCAACCCTATCTACTGCCTTGGCCGAGCTCGCCACCAGAAGCTTTGGCAGCTCC 
TGACTTAGTTGGGATAGATGACGGAACCGGCTCGAGCGGTGGTCTTCGAAACCGTCGAGG 

A A 

3332 SACI, 3346 HIND3, 

SerThrSerGlyIleThrGlyAspAsnThrThrThrSerSerGluProAlaProSerGly
3362 TCAACTTCCGGCATTACGGGCGACAATACGACAACATCCTCTGAGCCCGCCCCTTCTGGC 
AGTTGAAGGCCGTAATGCCCGCTGTTATGCTGTTGTAGGAGACTCGGGCGGGGAAGACCG 

CysProProAspSerAspAlaGluSerTyrSerSerMetProProLeuGluGlyGluPro 
3422 TGCCCCCCCGACTCCGACGCTGAGTCCTATTCCTCCATGCCCCCCCTGGAGGGGGAGCCT 
ACGGGGGGGCTGAGGCTGCGACTCAGGATAAGGAGGTACGGGGGGGACCTCCCCCTCGGA

A 

3437 EAM1105I,

GlyAspProAspLeuSerAspGlySerTrpSerThrValSerSerGluAlaAsnAlaGlu 
3482 GGGGATCCGGATCTTAGCGACGGGTCATGGTCAACGGTCAGTAGTGAGGCCAACGCGGAG
CCCCTAGGCCTAGAATCGCTGCCCAGTACCAGTTGCCAGTCATCACTCCGGTTGCGCCTC 

A A A 

3484 BAMHI, 3485 BSABI, 3487 BSPEI,

AspValValCysCysSerMetSerTyrSerTrpThrGlyAlaLeuValThrProCysAla 
3542 GATGTCGTGTGCTGCTCAATGTCTTACTCTTGGACAGGCGCACTCGTCACCCCGTGCGCC 
CTACAGCACACGACGAGTTACAGAATGAGAACCTGTCCGCGTGAGCAGTGGGGCACGCGG 
3589 DRA3, 3600 SAC2,

AlaGluGluGlnLysLeuProIleAsnAlaLeuSerAsnSerLeuLeuArgHisHisAsn 
3602 GCGGAAGAACAGAAACTGCCCATCAATGCACTAAGCAACTCGTTGCTACGTCACCACAAT 
CGCCTTCTTGTCTTTGACGGGTAGTTACGTGATTCGTTGAGCAACGATGCAGTGGTGTTA 

3611 ALWNI, 3655 PFLMI,

LeuValTyrSerThrThrSerArgSerAlaCysGlnArgGlnLysLysValThrPheAsp 
3662 TTGGTGTATTCCACCACCTCACGCAGTGCTTGCCAAAGGCAGAAGAAAGTCACATTTGAC
AACCACATAAGGTGGTGGAGTGCGTCACGAACGGTTTCCGTCTTCTTTCAGTGTAAACTG 

3681 DRA3, 

ArgLeuGlnValLeuAspSerHisTyrGlnAspValLeuLysGluValLysAlaAlaAla 
3722 AGACTGCAAGTTCTGGACAGCCATTACCAGGACGTACTCAAGGAGGTTAAAGCAGCGGCG 
TCTGACGTTCAAGACCTGTCGGTAATGGTCCTGCATGAGTTCCTCCAATTTCGTCGCCGC 

SerLysValLysAlaAsnLeuLeuSerValGluGluAlaCysSerLeuThrProProHis 
3782 TCAAAAGTGAAGGCTAACTTGCTATCCGTAGAGGAAGCTTGCAGCCTGACGCCCCCACAC 
AGTTTTCACTTCCGATTGAACGATAGGCATCTCCTTCGAACGTCGGACTGCGGGGGTGTG 

A 

3816 HIND3, 

SerAlaLysSerLysPheGlyTyrGlyAlaLysAspValArgCysHisAlaArgLysAla 
3842 TCAGCCAAATCCAAGTTTGGTTATGGGGCAAAAGACGTCCGTTGCCATGCCAGAAAGGCC 
AGTCGGTTTAGGTTCAAACCAATACCCCGTTTTCTGCAGGCAACGGTACGGTCTTTCCGG

A A 

3875 AAT2, 3890 BGLI, 

ValThrHisIleAsnSerValTrpLysAspLeuLeuGluAspAsnValThrProIleAsp 
3902 GTAACCCACATCAACTCCGTGTGGAAAGACCTTCTGGAAGACAATGTAACACCAATAGAC 
CATTGGGTGTAGTTGAGGCACACCTTTCTGGAAGACCTTCTGTTACATTGTGGTTATCTG 

ThrThrIleMetAlaLysAsnGluValPheCysValGlnProGluLysGlyGlyArgLys
3962 ACTACCATCATGGCTAAGAACGAGGTTTTCTGCGTTCAGCCTGAGAAGGGGGGTCGTAAG 
TGATGGTAGTACCGATTCTTGCTCCAAAAGACGCAAGTCGGACTCTTCCCCCCAGCATTC 

ProAlaArgLeuIleValPheProAspLeuGlyValArgValCysGluLysMetAlaLeu
4022 CCAGCTCGTCTCATCGTGTTCCCCGATCTGGGCGTGCGCGTGTGCGAAAAGATGGCTTTG 
GGTCGAGCAGAGTAGCACAAGGGGCTAGACCCGCACGCGCACACGCTTTTCTACCGAAAC 

TyrAspValValThrLysLeuProLeuAlaValMetGlySerSerTyrGlyPheGlnTyr
4082 TACGACGTGGTTACAAAGCTCCCCTTGGCCGTGATGGGAAGCTCCTACGGATTCCAATAC
ATGCTGCACCAATGTTTCGAGGGGAACCGGCACTACCCTTCGAGGATGCCTAAGGTTATG 

SerProGlyGlnArgValGluPheLeuValGlnAlaTrpLysSerLysLysThrProMet 
4142 TCACCAGGACAGCGGGTTGAATTCCTCGTGCAAGCGTGGAAGTCCAAGAAAACCCCAATG 
AGTGGTCCTGTCGCCCAACTTAAGGAGCACGTTCGCACCTTCAGGTTCTTTTGGGGTTAC 

A 

4160 ECORI, 

GlyPheSerTyrAspThrArgCysPheAspSerThrValThrGluSerAspIleArgThr 
4202 GGGTTCTCGTATGATACCCGCTGCTTTGACTCCACAGTCACTGAGAGCGACATCCGTACG 
CCCAAGAGCATACTATGGGCGACGAAACTGAGGTGTCAGTGACTCTCGCTGTAGGCATGC 
4229 DRDI, 4236 ALWNI,

GluGluAlaIleTyrGlnCysCysAspLeuAspProGlnAlaArgValAlaIleLysSer
4262 GAGGAGGCAATCTACCAATGTTGTGACCTCGACCCCCAAGCCCGCGTGGCCATCAAGTCC
CTCCTCCGTTAGATGGTTACAACACTGGAGCTGGGGGTTCGGGCGCACCGGTAGTTCAGG

4301 BGLI, 4308 BALI, 

LeuThrGluArgLeuTyrValGlyGlyProLeuThrAsnSerArgGlyGluAsnCysGly 
4322 CTCACCGAGAGGCTTTATGTTGGGGGCCCTCTTACCAATTCAAGGGGGGAGAACTGCGGC
GAGTGGCTCTCCGAAATACAACCCCCGGGAGAATGGTTAAGTTCCCCCCTCTTGACGCCG 

A 

4345 APAI, 

TyrArgArgCysArgAlaSerGlyValLeuThrThrSerCysGlyAsnThrLeuThrCys 
4382 TATCGCAGGTGCCGCGCGAGCGGCGTACTGACAACTAGCTGTGGTAACACCCTCACTTGC 
ATAGCGTCCACGGCGCGCTCGCCGCATGACTGTTGATCGACACCATTGTGGGAGTGAACG 

TyrIleLysAlaArgAlaAlaCysArgAlaAlaGlyLeuGlnAspCysThrMetLeuVal
4442 TACATCAAGGCCCGGGCAGCCTGTCGAGCCGCAGGGCTCCAGGACTGCACCATGCTCGTG
ATGTAGTTCCGGGCCCGTCGGACAGCTCGGCGTCCCGAGGTCCTGACGTGGTACGAGCAC 

A • 

4452 SMAI XMAI, 

CysGlyAspAspLeuValValIleCysGluSerAlaGlyValGlnGluAspAlaAlaSer
4502 TGTGGCGACGACTTAGTCGTTATCTGTGAAAGCGCGGGGGTCCAGGAGGACGCGGCGAGC
ACACCGCTGCTGAATCAGCAATAGACACTTTCGCGCCCCCAGGTCCTCCTGCGCCGCTCG 

A <N 

4508 DRDI, 4511 TTH3I,

LeuArgAlaPheThrGluAlaMetThrArgTyrSerAlaProProGlyAspProProGln
4562 CTGAGAGCCTTCACGGAGGCTATGACCAGGTACTCCGCCCCCCCTGGGGACCCCCCACAA 
GACTCTCGGAAGTGCCTCCGATACTGGTCCATGAGGCGGGGGGGACCCCTGGGGGGTGTT 

ProGluTyrAspLeuGluLeuIleThrSerCysSerSerAsnValSerValAlaHisAsp 
4622 CCAGAATACGACTTGGAGCTCATAACATCATGCTCCTCCAACGTGTCAGTCGCCCACGAC
GGTCTTATGCTGAACCTCGAGTATTGTAGTACGAGGAGGTTGCACAGTCAGCGGGTGCTG 

A 

4637 SACI,

GlyAlaGlyLysArgValTyrTyrLeuThrArgAspProThrThrProLeuAlaArgAla 
4682 GGCGCTGGAAAGAGGGTCTACTACCTCACCCGTGACCCTACAACCCCCCTCGCGAGAGCT 
CCGCGACCTTTCTCCCAGATGATGGAGTGGGCACTGGGATGTTGGGGGGAGCGCTCTCGA 

A 

4731 NRUI, 

AlaTrpGluThrAlaArgHisThrProValAsnSerTrpLeuGlyAsnIleIleMetPhe
4742 GCGTGGGAGACAGCAAGACACACTCCAGTCAATTCCTGGCTAGGCAACATAATCATGTTT
CGCACCCTCTGTCGTTCTGTGTGAGGTCAGTTAAGGACCGATCCGTTGTATTAGTACAAA 

AlaProThrLeuTrpAlaArgMetIleLeuMetThrHisPhePheSerValLeuIleAla
4802 GCCCCCACACTGTGGGCGAGGATGATACTGATGACCCATTTCTTTAGCGTCCTTATAGCC
CGGGGGTGTGACACCCGCTCCTACTATGACTACTGGGTAAAGAAATCGCAGGAATATCGG 

A ^ 

4806 PFLMI, 4807 DRA3,

ArgAspGlnLeuGluGlnAlaLeuAspCysGluIleTyrGlyAlaCysTyrSerIleGlu
4862 AGGGACCAGCTTGAACAGGCCCTCGATTGCGAGATCTACGGGGCCTGCTACTCCATAGAA
TCCCTGGTCGAACTTGTCCGGGAGCTAACGCTCTAGATGCCCCGGACGATGAGGTATCTT 

4893 BGL2, 

ProLeuAspLeuProProIleIleGlnArgLeuHisGlyLeuSerAlaPheSerLeuHis
4922 CCACTGGATCTACCTCCAATCATTCAAAGACTCCATGGCCTCAGCGCATTTTCACTCCAC
GGTGACCTAGATGGAGGTTAGTAAGTTTCTGAGGTACCGGAGTCGCGTAAAAGTGAGGTG

4954 NCOI,

SerTyrSerProGlyGluIleAsnArgValAlaAlaCysLeuArgLysLeuGlyValPro 
4982 AGTTACTCTCCAGGTGAAATCAATAGGGTGGCCGCATGCCTCAGAAAACTTGGGGTACCG
TCAATGAGAGGTCCACTTTAGTTATCCCACCGGCGTACGGAGTCTTTTGAACCCCATGGC 

A A 

5015 SPHI, 5035 KPNI, 

ProLeuArgAlaTrpArgHisArgAlaArgSerValArgAlaArgLeuLeuAlaArgGly 
5042 CCCTTGCGAGCTTGGAGACACCGGGCCCGGAGCGTCCGCGCTAGGCTTCTGGCCAGAGGA
GGGAACGCTCGAACCTCTGTGGCCCGGGCCTCGCAGGCGCGATCCGAAGACCGGTCTCCT 

A A 

5064 APAI, 5091 BALI, 

GlyArgAlaAlaIleCysGlyLysTyrLeuPheAsnTrpAlaValArgThrLysLeuLys
5102 GGCAGGGCTGCCATATGTGGCAAGTACCTCTTCAACTGGGCAGTAAGAACAAAGCTCAAA
CCGTCCCGACGGTATACACCGTTCATGGAGAAGTTGACCCGTCATTCTTGTTTCGAGTTT 

A 

5113 NDEI, 

LeuThrProIleAlaAlaAlaGlyGlnLeuAspLeuSerGlyTrpPheThrAlaGlyTyr
5162 CTCACTCCAATAGCGGCCGCTGGCCAGCTGGACTTGTCCGGCTGGTTCACGGCTGGCTAC 
GAGTGAGGTTATCGCCGGCGACCGGTCGACCTGAACAGGCCGACCAAGTGCCGACCGATG 

5174 NOTI, 5175 EAGI XMA3, 5182 BALI, 5186 PVU2,

SerGlyGlyAspIleTyrHisSerValSerHisAlaArgProArgTrpIleTrpPheCys 
5222 AGCGGGGGAGACATTTATCACAGCGTGTCTCATGCCCGGCCCCGCTGGATCTGGTTTTGC 
TCGCCCCCTCTGTAAATAGTGTCGCACAGAGTACGGGCCGGGGCGACCTAGACCAAAACG 

A 

5240 DRA3,

LeuLeuLeuLeuAlaAlaGlyValGlyIleTyrLeuLeuProAsnArgMetSerThrAsn
5282 CTACTCCTGCTTGCTGCAGGGGTAGGCATCTACCTCCTCCCCAACCGAATGAGCACGAAT 
GATGAGGACGAACGACGTCCCCATCCGTAGATGGAGGAGGGGTTGGCTTACTCGTGCTTA 

A 

5295 PSTI, 

ProLysProGlnArgLysThrLysArgAsnThrAsnArgArgProGlnAspValLysPhe 
5342 CCTAAACCTCAAAGAAAGACCAAACGTAACACCAACCGGCGGCCGCAGGACGTCAAGTTC 
GGATTTGGAGTTTCTTTCTGGTTTGCATTGTGGTTGGCCGCCGGCGTCCTGCAGTTCAAG 

A A A A 

5380 NOTI, 5381 EAGI XMA3, 5390 AAT2, 5401 SMAI XMAI,

ProGlyGlyGlyGlnIleValGlyGlyValTyrLeuLeuProArgArgGlyProArgLeu
5402 CCGGGTGGCGGTCAGATCGTTGGTGGAGTTTACTTGTTGCCGCGCAGGGGCCCTAGATTG
GGCCCACCGCCAGTCTAGCAACCACCTCAAATGAACAACGGCGCGTCCCCGGGATCTAAC



98/100

WO 01/38360

PCT/US00/32326

FIGURE 22 - Page 10

5449 APAI, 

GlyValArgAlaThrArgLysThrSerGluArgSerGlnProArgGlyArgArgGlnPro 
5462 GGTGTGCGCGCGACGAGAAAGACTTCCGAGCGGTCGCAACCTCGAGGTAGACGTCAGCCT 
CCACACGCGCGCTGCTCTTTCTGAAGGCTCGCCAGCGTTGGAGCTCCATCTGCAGTCGGA

A A A A 

5467 BSSH2, 5478 XMNI, 5502 XHOI, 5511 AAT2,

IleProLysAlaArgArgProGluGlyArgThrTrpAlaGlnProGlyTyrProTrpPro 
5522 ATCCCCAAGGCTCGTCGGCCCGAGGGCAGGACCTGGGCTCAGCCCGGGTACCCTTGGCCC 
TAGGGGTTCCGAGCAGCCGGGCTCCCGTCCTGGACCCGAGTCGGGCCCATGGGAACCGGG 

A A A A 

5548 ALWNI, 5558 ESPI, 5564 SMAI XMAI, 5568 KPNI,

LeuTyrGlyAsnGluGlyCysGlyTrpAlaGlyTrpLeuLeuSerProArgGlySerArg 
5582 CTCTATGGCAATGAGGGCTGCGGGTGGGCGGGATGGCTCCTGTCTCCCCGTGGCTCTCGG 
GAGATACCGTTACTCCCGACGCCCACCCGCCCTACCGAGGACAGAGGGGCACCGAGAGCC 

ProSerTrpGlyProThrAspProArgArgArgSerArgAsnLeuGlyLysValIleAsp
5642 CCTAGCTGGGGCCCCACAGACCCCCGGCGTAGGTCGCGCAATTTGGGTAAGGTCATCGAT 
GGATCGACCCCGGGGTGTCTGGGGGCCGCATCCAGCGCGTTAAACCCATTCCAGTAGCTA 

A A 

5650 APAI, 5696 CLAI,

ThrLeuThrCysGlyPheAlaAspLeuMetGlyTyrIleProLeuValGlyAlaProLeu
5702 ACCCTTACGTGCGGCTTCGCCGACCTCATGGGGTACATACCGCTCGTCGGCGCCCCTCTT 
TGGGAATGCACGCCGAAGCGGCTGGAGTACCCCATGTATGGCGAGCAGCCGCGGGGAGAA 

A A A 

5724 HGIE2, 5750 KASI NARI, 5756 ECONI,

GlyGlyAlaAlaArgAlaOC AM 
5762 GGAGGCGCTGCCAGGGCCTAATAGTCGAC 
CCTCCGCGACGGTCCCGGATTATCAGCTG 

5785 SALI, 
<110> CHIRON CORPORATION et al.

<120> NOVEL HCV NON-STRUCTURAL POLYPEPTIDE

<130> PP01617.003

<140>
<141>

<160> 19

<170> PatentIn Ver. 2.0

<210> 1
<211> 9620
<212> DNA
<213> Artificial Sequence

<220>
<221> CDS
<222> (1990)..(7302)

<220>
<223> Description of Artificial Sequence: Hepatitis C pns345

<400> 1



cgcgcgtttc ggtgatgacg gtgaaaacct ctgacacatg cagctcccgg agacggtcac 60
agcttgtctg taagcggatg ccgggagcag acaagcccgt cagggcgcgt cagcgggtgt 120
tggcgggtgt [illegible] ttaactatgc ggcatcagag cagattgtac tgagagtgca 180
ccatatgaag ctttttgcaa aagcctaggc ctccaaaaaa gcctcctcac tacttctgga 240
atagctcaga ggccgaggcg gcctcggcct ctgcataaat aaaaaaaatt agtcagccat 300
ggggcggaga atgggcggaa ctgggcgggg agggaattat tggctattgg ccattgcata 360
cgttgtatct atatcataat atgtacattt atattggctc atgtccaata tgaccgccat 420
gttgacattg attattgact agttattaat agtaatcaat tacggggtca ttagttcata 480
gcccatatat ggagttccgc gttacataac ttacggtaaa tggcccgcct ggctgaccgc 540
ccaacgaccc ccgcccattg acgtcaataa tgacgtatgt tcccatagta acgccaatag 600
ggactttcca ttgacgtcaa tgggtggagt atttacggta aactgcccac ttggcagtac 660
atcaagtgta tcatatgcca agtccgcccc ctattgacgt caatgacggt aaatggcccg 720
cctggcatta tgcccagtac atgaccttac gggactttcc tacttggcag tacatctacg 780
tattagtcat cgctattacc atggtgatgc ggttttggca gtacaccaat gggcgtggat 840
agcggtttga ctcacgggga tttccaagtc tccaccccat tgacgtcaat gggagtttgt 900
tttggcacca aaatcaacgg gactttccaa aatgtcgtaa taaccccgcc ccgttgacgc 960
aaatgggcgg taggcgtgta cggtgggagg tctatataag cagagctcgt ttagtgaacc 1020
gtcagatcgc ctggagacgc catccacgct gttttgacct ccatagaaga caccgggacc 1080
gatccagcct ccgcggccgg gaacggtgca ttggaacgcg gattccccgt gccaagagtg 1140
acgtaagtac cgcctataga ctctataggc acaccccttt ggctcttatg catgctatac 1200
tgtttttggc ttggggccta tacacccccg ctccttatgc tataggtgat ggtatagctt 1260
agcctatagg tgtgggttat tgaccattat tgaccactcc cctattggtg acgatacttt 1320
ccattactaa tccataacat ggctctttgc cacaactatc tctattggct atatgccaat 1380
actctgtcct tcagagactg acacggactc tgtattttta caggatgggg tccatttatt 1440
atttacaaat tcacatatac aacaacgccg tcccccgtgc ccgcagtttt tattaaacat 1500
agcgtgggat ctccgacatc tcgggtacgt gttccggaca tgggctcttc tccggtagcg 1560
gcggagcttc cacatccgag ccctggtccc atccgtccag cggctcatgg tcgctcggca 1620
gctccttgct cctaacagtg gaggccagac ttaggcacag cacaatgccc accaccacca 1680
gtgtgccgca caaggccgtg gcggtagggt atgtgtctga aaatgagctc ggagattggg 1740
ctcgcacctg gacgcagatg gaagacttaa ggcagcggca gaagaagatg caggcagctg 1800
agttgttgta ttctgataag agtcagaggt aactcccgtt gcggtgctgt taacggtgga 1860
gggcagtgta gtctgagcag tactcgttgc tgccgcgcgc gccaccagac ataatagctg 1920
acagactaac agactgttcc tttccatggg tcttttctgc agtcaccgtc gtcgacctaa 1980



gaattcacc atg gct gca tat gca gct cag ggc tat aag gtg cta gta ctc 2031
Met Ala Ala Tyr Ala Ala Gln Gly Tyr Lys Val Leu Val Leu
1 5 10

aac ccc tct gtt gct gca aca ctg ggc ttt ggt gct tac atg tcc aag 2079
Asn Pro Ser Val Ala Ala Thr Leu Gly Phe Gly Ala Tyr Met Ser Lys
15 20 25 30

gct cat ggg atc gat cct aac atc agg acc ggg gtg aga aca att acc 2127
Ala His Gly Ile Asp Pro Asn Ile Arg Thr Gly Val Arg Thr Ile Thr
35 40 45

act ggc agc ccc atc acg tac tcc acc tac ggc aag ttc ctt gcc gac 2175
Thr Gly Ser Pro Ile Thr Tyr Ser Thr Tyr Gly Lys Phe Leu Ala Asp
50 55 60

ggc ggg tgc tcg ggg ggc gct tat gac ata ata att tgt gac gag tgc 2223
Gly Gly Cys Ser Gly Gly Ala Tyr Asp Ile Ile Ile Cys Asp Glu Cys
65 70 75

cac tcc acg gat gcc aca tcc atc ttg ggc att ggc act gtc ctt gac 2271
His Ser Thr Asp Ala Thr Ser Ile Leu Gly Ile Gly Thr Val Leu Asp
80 85 90 
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caa gca gag act gcg ggg gcg aga ctg gtt gtg etc gcc acc gcc acc 2319 
Gin Ala Glu Thr Ala Gly Ala Arg Leu Val Val Leu Ala Thr Ala Thr 
95 ICQ 105 110 

cct ccg ggc tec gtc act gtg ccc cat ccc aac ate gag gag gtt get 2367 
Pro Pro Gly Ser Val Thr Val Pro His Pro Asn lie Glu Glu Val Ala 
115 120 125 

ctg tec acc acc gga gag ate cct ttt tac ggc aag get ate ccc etc 2415 
Leu Ser Thr Thr Gly Glu lie Pro Phe Tyr Gly Lys Ala He Pro Leu 
130 135 140 

gaa gta ate aag ggg ggg aga cat etc ate tte tgt cat tea aag aag 2463 
Glu Val He Lys Gly Gly Arg His Leu He Phe Cys His Ser Lys Lys 
145 150 155 

aag tgc gac gaa etc gcc gca aag ctg gtc gca ttg ggc ate aat gcc 2511 
Lys Cys Asp Glu Leu Ala Ala Lys Leu Val Ala Leu Gly He Asn Ala 
160 165 170 

gtg gcc tac tac cgc ggt ctt gac gtg tec gtc ate ccg acc age ggc 2559 
Val Ala Tyr Tyr Arg Gly Leu Asp Val Ser Val He Pro Thr Ser Gly 
175 180 185 190 

gat gtt gtc gtc gtg gca acc gat gee etc atg acc ggc tat acc ggc 2607 
Asp Val Val Val Val Ala Thr Asp Ala Leu Met Thr Gly Tyr Thr Gly 
195 200 205 

gac tte gac teg gtg ata gac tgc aat acg tgt gtc acc cag aca gtc 2655 
Asp Phe Asp Ser Val He Asp Cys Asn Thr Cys Val Thr Gin Thr Val 
210 215 220 

gat tte age ctt gac cct ace tte ace att gag aca ate acg etc ccc 2703 
Asp Phe Ser Leu Asp Pro Thr Phe Thr He Glu Thr He Thr Leu Pro 
225 230 235 

caa gat get gtc tec cgc act caa cgt egg ggc agg act ggc agg ggg 2751 
Gin Asp Ala Val Ser Arg Thr Gin Arg Arg Gly Arg Thr Gly Arg Gly 
240 245 250 

aag cea ggc ate tac aga ttt gtg gca ccg ggg gag cgc ccc tec ggc 2799 
Lys Pro Gly He Tyr Arg Phe Val Ala Pro Gly Glu Arg Pro Ser Gly 
255 260 265 270 

atg tte gac teg tec gtc etc tgt gag tgc tat gac gca ggc tgt get 2847 
Met Phe Asp Ser Ser Val Leu Cys Glu Cys Tyr Asp Ala Gly Cys Ala 
275 280 285 

tgg tat gag etc acg ccc gcc gag act aca gtt agg eta cga gcg tac 2895 
Trp Tyr Glu Leu Thr Pro Ala Glu Thr Thr Val Arg Leu Arg Ala Tyr 
290 295 300 

atg aac acc ccg ggg ctt ccc gtg tgc cag gac eat ctt gaa ttt tgg 2943 
Met Asn Thr Pro Gly Leu Pro Val Cys Gin Asp His Leu Glu Phe Trp 

305 310 315 

gag ggc gtc ttt aca ggc etc act eat ata gat gee cac ttt eta tec 2991 
Glu Gly Val Phe Thr Gly Leu Thr His He Asp Ala His Phe Leu Ser 
320 325 330 
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cag aca aag cag agt ggg gag aac ctt cct tac ctg gta gcg tac caa 3039 
Gin Thr Lys Gin Ser Gly Glu Asn Leu Pro Tyr Leu Val Ala Tyr Gin 
335 340 345 350 

gcc acc gtg tgc gct agg gct caa gcc cct ccc cca tcg tgg gac cag 3087
Ala Thr Val Cys Ala Arg Ala Gln Ala Pro Pro Pro Ser Trp Asp Gln
355 360 365

atg tgg aag tgt ttg att cgc ctc aag ccc acc ctc cat ggg cca aca 3135
Met Trp Lys Cys Leu Ile Arg Leu Lys Pro Thr Leu His Gly Pro Thr
370 375 380

ccc ctg cta tac aga ctg ggc gct gtt cag aat gaa atc acc ctg acg 3183
Pro Leu Leu Tyr Arg Leu Gly Ala Val Gln Asn Glu Ile Thr Leu Thr
385 390 395

cac cca gtc acc aaa tac atc atg aca tgc atg tcg gcc gac ctg gag 3231
His Pro Val Thr Lys Tyr Ile Met Thr Cys Met Ser Ala Asp Leu Glu
400 405 410

gtc gtc acg agc acc tgg gtg ctc gtt ggc ggc gtc ctg gct gct ttg 3279
Val Val Thr Ser Thr Trp Val Leu Val Gly Gly Val Leu Ala Ala Leu
415 420 425 430

gcc gcg tat tgc ctg tca aca ggc tgc gtg gtc ata gtg ggc agg gtc 3327
Ala Ala Tyr Cys Leu Ser Thr Gly Cys Val Val Ile Val Gly Arg Val
435 440 445

gtc ttg tcc ggg aag ccg gca atc ata cct gac agg gaa gtc ctc tac 3375
Val Leu Ser Gly Lys Pro Ala Ile Ile Pro Asp Arg Glu Val Leu Tyr
450 455 460

cga gag ttc gat gag atg gaa gag tgc tct cag cac tta ccg tac atc 3423
Arg Glu Phe Asp Glu Met Glu Glu Cys Ser Gln His Leu Pro Tyr Ile
465 470 475

gag caa ggg atg atg ctc gcc gag cag ttc aag cag aag gcc ctc ggc 3471
Glu Gln Gly Met Met Leu Ala Glu Gln Phe Lys Gln Lys Ala Leu Gly
480 485 490

ctc ctg cag acc gcg tcc cgt cag gca gag gtt atc gcc cct gct gtc 3519
Leu Leu Gln Thr Ala Ser Arg Gln Ala Glu Val Ile Ala Pro Ala Val
495 500 505 510

cag acc aac tgg caa aaa ctc gag acc ttc tgg gcg aag cat atg tgg 3567
Gln Thr Asn Trp Gln Lys Leu Glu Thr Phe Trp Ala Lys His Met Trp
515 520 525

aac ttc atc agt ggg ata caa tac ttg gcg ggc ttg tca acg ctg cct 3615
Asn Phe Ile Ser Gly Ile Gln Tyr Leu Ala Gly Leu Ser Thr Leu Pro
530 535 540

ggt aac ccc gcc att gct tca ttg atg gct ttt aca gct gct gtc acc 3663
Gly Asn Pro Ala Ile Ala Ser Leu Met Ala Phe Thr Ala Ala Val Thr
545 550 555

agc cca cta acc act agc caa acc ctc ctc ttc aac ata ttg ggg ggg 3711
Ser Pro Leu Thr Thr Ser Gln Thr Leu Leu Phe Asn Ile Leu Gly Gly
560 565 570

tgg gtg gct gcc cag ctc gcc gcc ccc ggt gcc gct act gcc ttt gtg 3759
Trp Val Ala Ala Gln Leu Ala Ala Pro Gly Ala Ala Thr Ala Phe Val
575 580 585 590

ggc gct ggc tta gct ggc gcc gcc atc ggc agt gtt gga ctg ggg aag 3807
Gly Ala Gly Leu Ala Gly Ala Ala Ile Gly Ser Val Gly Leu Gly Lys
595 600 605

gtc ctc ata gac atc ctt gca ggg tat ggc gcg ggc gtg gcg gga gct 3855
Val Leu Ile Asp Ile Leu Ala Gly Tyr Gly Ala Gly Val Ala Gly Ala
610 615 620

ctt gtg gca ttc aag atc atg agc ggt gag gtc ccc tcc acg gag gac 3903
Leu Val Ala Phe Lys Ile Met Ser Gly Glu Val Pro Ser Thr Glu Asp
625 630 635

ctg gtc aat cta ctg ccc gcc atc ctc tcg ccc gga gcc ctc gta gtc 3951
Leu Val Asn Leu Leu Pro Ala Ile Leu Ser Pro Gly Ala Leu Val Val
640 645 650

ggc gtg gtc tgt gca gca ata ctg cgc cgg cac gtt ggc ccg ggc gag 3999
Gly Val Val Cys Ala Ala Ile Leu Arg Arg His Val Gly Pro Gly Glu
655 660 665 670

ggg gca gtg cag tgg atg aac cgg ctg ata gcc ttc gcc tcc cgg ggg 4047
Gly Ala Val Gln Trp Met Asn Arg Leu Ile Ala Phe Ala Ser Arg Gly
675 680 685

aac cat gtt tcc ccc acg cac tac gtg ccg gag agc gat gca gct gcc 4095
Asn His Val Ser Pro Thr His Tyr Val Pro Glu Ser Asp Ala Ala Ala
690 695 700

cgc gtc act gcc ata ctc agc agc ctc act gta acc cag ctc ctg agg 4143
Arg Val Thr Ala Ile Leu Ser Ser Leu Thr Val Thr Gln Leu Leu Arg
705 710 715

cga ctg cac cag tgg ata agc tcg gag tgt acc act cca tgc tcc ggt 4191
Arg Leu His Gln Trp Ile Ser Ser Glu Cys Thr Thr Pro Cys Ser Gly
720 725 730

tcc tgg cta agg gac atc tgg gac tgg ata tgc gag gtg ttg agc gac 4239
Ser Trp Leu Arg Asp Ile Trp Asp Trp Ile Cys Glu Val Leu Ser Asp
735 740 745 750

ttt aag acc tgg cta aaa gct aag ctc atg cca cag ctg cct ggg atc 4287
Phe Lys Thr Trp Leu Lys Ala Lys Leu Met Pro Gln Leu Pro Gly Ile
755 760 765

ccc ttt gtg tcc tgc cag cgc ggg tat aag ggg gtc tgg cga ggg gac 4335
Pro Phe Val Ser Cys Gln Arg Gly Tyr Lys Gly Val Trp Arg Gly Asp
770 775 780

ggc atc atg cac act cgc tgc cac tgt gga gct gag atc act gga cat 4383
Gly Ile Met His Thr Arg Cys His Cys Gly Ala Glu Ile Thr Gly His
785 790 795

gtc aaa aac ggg acg atg agg atc gtc ggt cct agg acc tgc agg aac 4431
Val Lys Asn Gly Thr Met Arg Ile Val Gly Pro Arg Thr Cys Arg Asn
800 805 810

atg tgg agt ggg acc ttc ccc att aat gcc tac acc acg ggc ccc tgt 4479
Met Trp Ser Gly Thr Phe Pro Ile Asn Ala Tyr Thr Thr Gly Pro Cys
815 820 825 830

acc ccc ctt cct gcg ccg aac tac acg ttc gcg cta tgg agg gtg tct 4527
Thr Pro Leu Pro Ala Pro Asn Tyr Thr Phe Ala Leu Trp Arg Val Ser
835 840 845

gca gag gaa tac gtg gag ata agg cag gtg ggg gac ttc cac tac gtg 4575
Ala Glu Glu Tyr Val Glu Ile Arg Gln Val Gly Asp Phe His Tyr Val
850 855 860

acg ggt atg act act gac aat ctt aaa tgc ccg tgc cag gtc cca tcg 4623
Thr Gly Met Thr Thr Asp Asn Leu Lys Cys Pro Cys Gln Val Pro Ser
865 870 875

ccc gaa ttt ttc aca gaa ttg gac ggg gtg cgc cta cat agg ttt gcg 4671
Pro Glu Phe Phe Thr Glu Leu Asp Gly Val Arg Leu His Arg Phe Ala
880 885 890

ccc ccc tgc aag ccc ttg ctg cgg gag gag gta tca ttc aga gta gga 4719
Pro Pro Cys Lys Pro Leu Leu Arg Glu Glu Val Ser Phe Arg Val Gly
895 900 905 910

ctc cac gaa tac ccg gta ggg tcg caa tta cct tgc gag ccc gaa ccg 4767
Leu His Glu Tyr Pro Val Gly Ser Gln Leu Pro Cys Glu Pro Glu Pro
915 920 925

gac gtg gcc gtg ttg acg tcc atg ctc act gat ccc tcc cat ata aca 4815
Asp Val Ala Val Leu Thr Ser Met Leu Thr Asp Pro Ser His Ile Thr
930 935 940

gca gag gcg gcc ggg cga agg ttg gcg agg gga tca ccc ccc tct gtg 4863
Ala Glu Ala Ala Gly Arg Arg Leu Ala Arg Gly Ser Pro Pro Ser Val
945 950 955

gcc agc tcc tcg gct agc cag cta tcc gct cca tct ctc aag gca act 4911
Ala Ser Ser Ser Ala Ser Gln Leu Ser Ala Pro Ser Leu Lys Ala Thr
960 965 970

tgc acc gct aac cat gac tcc cct gat gct gag ctc ata gag gcc aac 4959
Cys Thr Ala Asn His Asp Ser Pro Asp Ala Glu Leu Ile Glu Ala Asn
975 980 985 990

ctc cta tgg agg cag gag atg ggc ggc aac atc acc agg gtt gag tca 5007
Leu Leu Trp Arg Gln Glu Met Gly Gly Asn Ile Thr Arg Val Glu Ser
995 1000 1005

gaa aac aaa gtg gtg att ctg gac tcc ttc gat ccg ctt gtg gcg gag 5055
Glu Asn Lys Val Val Ile Leu Asp Ser Phe Asp Pro Leu Val Ala Glu
1010 1015 1020

gag gac gag cgg gag atc tcc gta ccc gca gaa atc ctg cgg aag tct 5103
Glu Asp Glu Arg Glu Ile Ser Val Pro Ala Glu Ile Leu Arg Lys Ser
1025 1030 1035

cgg aga ttc gcc cag gcc ctg ccc gtt tgg gcg cgg ccg gac tat aac 5151
Arg Arg Phe Ala Gln Ala Leu Pro Val Trp Ala Arg Pro Asp Tyr Asn
1040 1045 1050

ccc ccg cta gtg gag acg tgg aaa aag ccc gac tac gaa cca cct gtg 5199
Pro Pro Leu Val Glu Thr Trp Lys Lys Pro Asp Tyr Glu Pro Pro Val
1055 1060 1065 1070

gtc cat ggc tgc ccg ctt cca cct cca aag tcc cct cct gtg cct ccg 5247
Val His Gly Cys Pro Leu Pro Pro Pro Lys Ser Pro Pro Val Pro Pro
1075 1080 1085

cct cgg aag aag cgg acg gtg gtc ctc act gaa tca acc cta tct act 5295
Pro Arg Lys Lys Arg Thr Val Val Leu Thr Glu Ser Thr Leu Ser Thr
1090 1095 1100

gcc ttg gcc gag ctc gcc acc aga agc ttt ggc agc tcc tca act tcc 5343
Ala Leu Ala Glu Leu Ala Thr Arg Ser Phe Gly Ser Ser Ser Thr Ser
1105 1110 1115

ggc att acg ggc gac aat acg aca aca tcc tct gag ccc gcc cct tct 5391
Gly Ile Thr Gly Asp Asn Thr Thr Thr Ser Ser Glu Pro Ala Pro Ser
1120 1125 1130

ggc tgc ccc ccc gac tcc gac gct gag tcc tat tcc tcc atg ccc ccc 5439
Gly Cys Pro Pro Asp Ser Asp Ala Glu Ser Tyr Ser Ser Met Pro Pro
1135 1140 1145 1150

ctg gag ggg gag cct ggg gat ccg gat ctt agc gac ggg tca tgg tca 5487
Leu Glu Gly Glu Pro Gly Asp Pro Asp Leu Ser Asp Gly Ser Trp Ser
1155 1160 1165

acg gtc agt agt gag gcc aac gcg gag gat gtc gtg tgc tgc tca atg 5535
Thr Val Ser Ser Glu Ala Asn Ala Glu Asp Val Val Cys Cys Ser Met
1170 1175 1180

tct tac tct tgg aca ggc gca ctc gtc acc ccg tgc gcc gcg gaa gaa 5583
Ser Tyr Ser Trp Thr Gly Ala Leu Val Thr Pro Cys Ala Ala Glu Glu
1185 1190 1195

cag aaa ctg ccc atc aat gca cta agc aac tcg ttg cta cgt cac cac 5631
Gln Lys Leu Pro Ile Asn Ala Leu Ser Asn Ser Leu Leu Arg His His
1200 1205 1210

aat ttg gtg tat tcc acc acc tca cgc agt gct tgc caa agg cag aag 5679
Asn Leu Val Tyr Ser Thr Thr Ser Arg Ser Ala Cys Gln Arg Gln Lys
1215 1220 1225 1230

aaa gtc aca ttt gac aga ctg caa gtt ctg gac agc cat tac cag gac 5727
Lys Val Thr Phe Asp Arg Leu Gln Val Leu Asp Ser His Tyr Gln Asp
1235 1240 1245

gta ctc aag gag gtt aaa gca gcg gcg tca aaa gtg aag gct aac ttg 5775
Val Leu Lys Glu Val Lys Ala Ala Ala Ser Lys Val Lys Ala Asn Leu
1250 1255 1260

cta tcc gta gag gaa gct tgc agc ctg acg ccc cca cac tca gcc aaa 5823
Leu Ser Val Glu Glu Ala Cys Ser Leu Thr Pro Pro His Ser Ala Lys
1265 1270 1275

tcc aag ttt ggt tat ggg gca aaa gac gtc cgt tgc cat gcc aga aag 5871
Ser Lys Phe Gly Tyr Gly Ala Lys Asp Val Arg Cys His Ala Arg Lys
1280 1285 1290

gcc gta acc cac atc aac tcc gtg tgg aaa gac ctt ctg gaa gac aat 5919
Ala Val Thr His Ile Asn Ser Val Trp Lys Asp Leu Leu Glu Asp Asn
1295 1300 1305 1310

gta aca cca ata gac act acc atc atg gct aag aac gag gtt ttc tgc 5967
Val Thr Pro Ile Asp Thr Thr Ile Met Ala Lys Asn Glu Val Phe Cys
1315 1320 1325

gtt cag cct gag aag ggg ggt cgt aag cca gct cgt ctc atc gtg ttc 6015
Val Gln Pro Glu Lys Gly Gly Arg Lys Pro Ala Arg Leu Ile Val Phe
1330 1335 1340

ccc gat ctg ggc gtg cgc gtg tgc gaa aag atg gct ttg tac gac gtg 6063
Pro Asp Leu Gly Val Arg Val Cys Glu Lys Met Ala Leu Tyr Asp Val
1345 1350 1355

gtt aca aag ctc ccc ttg gcc gtg atg gga agc tcc tac gga ttc caa 6111
Val Thr Lys Leu Pro Leu Ala Val Met Gly Ser Ser Tyr Gly Phe Gln
1360 1365 1370

tac tca cca gga cag cgg gtt gaa ttc ctc gtg caa gcg tgg aag tcc 6159
Tyr Ser Pro Gly Gln Arg Val Glu Phe Leu Val Gln Ala Trp Lys Ser
1375 1380 1385 1390

aag aaa acc cca atg ggg ttc tcg tat gat acc cgc tgc ttt gac tcc 6207
Lys Lys Thr Pro Met Gly Phe Ser Tyr Asp Thr Arg Cys Phe Asp Ser
1395 1400 1405

aca gtc act gag agc gac atc cgt acg gag gag gca atc tac caa tgt 6255
Thr Val Thr Glu Ser Asp Ile Arg Thr Glu Glu Ala Ile Tyr Gln Cys
1410 1415 1420

tgt gac ctc gac ccc caa gcc cgc gtg gcc atc aag tcc ctc acc gag 6303
Cys Asp Leu Asp Pro Gln Ala Arg Val Ala Ile Lys Ser Leu Thr Glu
1425 1430 1435

agg ctt tat gtt ggg ggc cct ctt acc aat tca agg ggg gag aac tgc 6351
Arg Leu Tyr Val Gly Gly Pro Leu Thr Asn Ser Arg Gly Glu Asn Cys
1440 1445 1450

ggc tat cgc agg tgc cgc gcg agc ggc gta ctg aca act agc tgt ggt 6399
Gly Tyr Arg Arg Cys Arg Ala Ser Gly Val Leu Thr Thr Ser Cys Gly
1455 1460 1465 1470

aac acc ctc act tgc tac atc aag gcc cgg gca gcc tgt cga gcc gca 6447
Asn Thr Leu Thr Cys Tyr Ile Lys Ala Arg Ala Ala Cys Arg Ala Ala
1475 1480 1485

ggg ctc cag gac tgc acc atg ctc gtg tgt ggc gac gac tta gtc gtt 6495
Gly Leu Gln Asp Cys Thr Met Leu Val Cys Gly Asp Asp Leu Val Val
1490 1495 1500

atc tgt gaa agc gcg ggg gtc cag gag gac gcg gcg agc ctg aga gcc 6543
Ile Cys Glu Ser Ala Gly Val Gln Glu Asp Ala Ala Ser Leu Arg Ala
1505 1510 1515

ttc acg gag gct atg acc agg tac tcc gcc ccc cct ggg gac ccc cca 6591
Phe Thr Glu Ala Met Thr Arg Tyr Ser Ala Pro Pro Gly Asp Pro Pro
1520 1525 1530

caa cca gaa tac gac ttg gag ctc ata aca tca tgc tcc tcc aac gtg 6639
Gln Pro Glu Tyr Asp Leu Glu Leu Ile Thr Ser Cys Ser Ser Asn Val
1535 1540 1545 1550

tca gtc gcc cac gac ggc gct gga aag agg gtc tac tac ctc acc cgt 6687
Ser Val Ala His Asp Gly Ala Gly Lys Arg Val Tyr Tyr Leu Thr Arg
1555 1560 1565

gac cct aca acc ccc ctc gcg aga gct gcg tgg gag aca gca aga cac 6735
Asp Pro Thr Thr Pro Leu Ala Arg Ala Ala Trp Glu Thr Ala Arg His
1570 1575 1580

act cca gtc aat tcc tgg cta ggc aac ata atc atg ttt gcc ccc aca 6783
Thr Pro Val Asn Ser Trp Leu Gly Asn Ile Ile Met Phe Ala Pro Thr
1585 1590 1595

ctg tgg gcg agg atg ata ctg atg acc cat ttc ttt agc gtc ctt ata 6831
Leu Trp Ala Arg Met Ile Leu Met Thr His Phe Phe Ser Val Leu Ile
1600 1605 1610

gcc agg gac cag ctt gaa cag gcc ctc gat tgc gag atc tac ggg gcc 6879
Ala Arg Asp Gln Leu Glu Gln Ala Leu Asp Cys Glu Ile Tyr Gly Ala
1615 1620 1625 1630

tgc tac tcc ata gaa cca ctg gat cta cct cca atc att caa aga ctc 6927
Cys Tyr Ser Ile Glu Pro Leu Asp Leu Pro Pro Ile Ile Gln Arg Leu
1635 1640 1645

cat ggc ctc agc gca ttt tca ctc cac agt tac tct cca ggt gaa atc 6975
His Gly Leu Ser Ala Phe Ser Leu His Ser Tyr Ser Pro Gly Glu Ile
1650 1655 1660

aat agg gtg gcc gca tgc ctc aga aaa ctt ggg gta ccg ccc ttg cga 7023
Asn Arg Val Ala Ala Cys Leu Arg Lys Leu Gly Val Pro Pro Leu Arg
1665 1670 1675

gct tgg aga cac cgg gcc cgg agc gtc cgc gct agg ctt ctg gcc aga 7071
Ala Trp Arg His Arg Ala Arg Ser Val Arg Ala Arg Leu Leu Ala Arg
1680 1685 1690

gga ggc agg gct gcc ata tgt ggc aag tac ctc ttc aac tgg gca gta 7119
Gly Gly Arg Ala Ala Ile Cys Gly Lys Tyr Leu Phe Asn Trp Ala Val
1695 1700 1705 1710

aga aca aag ctc aaa ctc act cca ata gcg gcc gct ggc cag ctg gac 7167
Arg Thr Lys Leu Lys Leu Thr Pro Ile Ala Ala Ala Gly Gln Leu Asp
1715 1720 1725

ttg tcc ggc tgg ttc acg gct ggc tac agc ggg gga gac att tat cac 7215
Leu Ser Gly Trp Phe Thr Ala Gly Tyr Ser Gly Gly Asp Ile Tyr His
1730 1735 1740

agc gtg tct cat gcc cgg ccc cgc tgg atc tgg ttt tgc cta ctc ctg 7263
Ser Val Ser His Ala Arg Pro Arg Trp Ile Trp Phe Cys Leu Leu Leu
1745 1750 1755

ctt gct gca ggg gta ggc atc tac ctc ctc ccc aac cga tgaaggttgg 7312
Leu Ala Ala Gly Val Gly Ile Tyr Leu Leu Pro Asn Arg
1760 1765 1770

ggtaaacact ccggcctaaa aaaaaaaaaa aatctagaaa ggcgcgccaa gatatcaagg 7372
atccactacg cgttagagct cgctgatcag cctcgactgt gccttctagt tgccagccat 7432
ctgttgtttg cccctccccc gtgccttcct tgaccctgga aggtgccact cccactgtcc 7492
tttcctaata aaatgaggaa attgcatcgc attgtctgag taggtgtcat tctattctgg 7552
ggggtggggt ggggcaggac agcaaggggg aggattggga agacaatagc aggcatgctg 7612
gggagctctt ccgcttcctc gctcactgac tcgctgcgct cggtcgttcg gctgcggcga 7672
gcggtatcag ctcactcaaa ggcggtaata cggttatcca cagaatcagg ggataacgca 7732
ggaaagaaca tgtgagcaaa aggccagcaa aaggccagga accgtaaaaa ggccgcgttg 7792
ctggcgtttt tccataggct ccgcccccct gacgagcatc acaaaaatcg acgctcaagt 7852
cagaggtggc gaaacccgac aggactataa agataccagg cgtttccccc tggaagctcc 7912
ctcgtgcgct ctcctgttcc gaccctgccg cttaccggat acctgtccgc ctttctccct 7972
tcgggaagcg tggcgctttc tcaatgctca cgctgtaggt atctcagttc ggtgtaggtc 8032
gttcgctcca agctgggctg tgtgcacgaa ccccccgttc agcccgaccg ctgcgcctta 8092
tccggtaact atcgtcttga gtccaacccg gtaagacacg acttatcgcc actggcagca 8152
gccactggta acaggattag cagagcgagg tatgtaggcg gtgctacaga gttcttgaag 8212
tggtggccta actacggcta cactagaagg acagtatttg gtatctgcgc tctgctgaag 8272
ccagttacct tcggaaaaag agttggtagc tcttgatccg gcaaacaaac caccgctggt 8332
agcggtggtt tttttgtttg caagcagcag attacgcgca gaaaaaaagg atctcaagaa 8392
gatcctttga tcttttctac ggggtctgac gctcagtgga acgaaaactc acgttaaggg 8452
attttggtca tgagattatc aaaaaggatc ttcacctaga tccttttaaa ttaaaaatga 8512
agttttaaat caatctaaag tatatatgag taaacttggt ctgacagtta ccaatgctta 8572
atcagtgagg cacctatctc agcgatctgt ctatttcgtt catccatagt tgcctgactc 8632
cccgtcgtgt agataactac gatacgggag ggcttaccat ctggccccag tgctgcaatg 8692
ataccgcgag acccacgctc accggctcca gatttatcag caataaacca gccagccgga 8752
agggccgagc gcagaagtgg tcctgcaact ttatccgcct ccatccagtc tattaattgt 8812
tgccgggaag ctagagtaag tagttcgcca gttaatagtt tgcgcaacgt tgttgccatt 8872
gctacaggca tcgtggtgtc acgctcgtcg tttggtatgg cttcattcag ctccggttcc 8932
caacgatcaa ggcgagttac atgatccccc atgttgtgca aaaaagcggt tagctccttc 8992
ggtcctccga tcgttgtcag aagtaagttg gccgcagtgt tatcactcat ggttatggca 9052
gcactgcata attctcttac tgtcatgcca tccgtaagat gcttttctgt gactggtgag 9112
tactcaacca agtcattctg agaatagtgt atgcggcgac cgagttgctc ttgcccggcg 9172
tcaatacggg ataataccgc gccacatagc agaactttaa aagtgctcat cattggaaaa 9232 
cgttcttcgg ggcgaaaact ctcaaggatc ttaccgctgt tgagatccag ttcgatgtaa 9292 
cccactcgtg cacccaactg atcttcagca tcttttactt tcaccagcgt ttctgggtga 9352 
gcaaaaacag gaaggcaaaa tgccgcaaaa aagggaataa gggcgacacg gaaatgttga 9412 
atactcatac tcttcctttt tcaatattat tgaagcattt atcagggtta ttgtctcatg 9472 
agcggataca tatttgaatg tatttagaaa aataaacaaa taggggttcc gcgcacattt 9532 
ccccgaaaag tgccacctga cgtctaagaa accattatta tcatgacatt aacctataaa 9592 
aataggcgta tcacgaggcc ctttcgtc 9620 



<210> 2 
<211> 1771 
<212> PRT 

<213> Hepatitis C virus 
<220> 

<223> Description of Artificial Sequence: Hepatitis C pns345 
<400> 2 

Met Ala Ala Tyr Ala Ala Gln Gly Tyr Lys Val Leu Val Leu Asn Pro
1 5 10 15

Ser Val Ala Ala Thr Leu Gly Phe Gly Ala Tyr Met Ser Lys Ala His
20 25 30

Gly Ile Asp Pro Asn Ile Arg Thr Gly Val Arg Thr Ile Thr Thr Gly
35 40 45

Ser Pro Ile Thr Tyr Ser Thr Tyr Gly Lys Phe Leu Ala Asp Gly Gly
50 55 60

Cys Ser Gly Gly Ala Tyr Asp Ile Ile Ile Cys Asp Glu Cys His Ser
65 70 75 80

Thr Asp Ala Thr Ser Ile Leu Gly Ile Gly Thr Val Leu Asp Gln Ala
85 90 95

Glu Thr Ala Gly Ala Arg Leu Val Val Leu Ala Thr Ala Thr Pro Pro
100 105 110

Gly Ser Val Thr Val Pro His Pro Asn Ile Glu Glu Val Ala Leu Ser
115 120 125

Thr Thr Gly Glu Ile Pro Phe Tyr Gly Lys Ala Ile Pro Leu Glu Val
130 135 140

Ile Lys Gly Gly Arg His Leu Ile Phe Cys His Ser Lys Lys Lys Cys
145 150 155 160

Asp Glu Leu Ala Ala Lys Leu Val Ala Leu Gly Ile Asn Ala Val Ala
165 170 175

Tyr Tyr Arg Gly Leu Asp Val Ser Val Ile Pro Thr Ser Gly Asp Val
180 185 190

Val Val Val Ala Thr Asp Ala Leu Met Thr Gly Tyr Thr Gly Asp Phe
195 200 205

Asp Ser Val Ile Asp Cys Asn Thr Cys Val Thr Gln Thr Val Asp Phe
210 215 220

Ser Leu Asp Pro Thr Phe Thr Ile Glu Thr Ile Thr Leu Pro Gln Asp
225 230 235 240

Ala Val Ser Arg Thr Gln Arg Arg Gly Arg Thr Gly Arg Gly Lys Pro
245 250 255

Gly Ile Tyr Arg Phe Val Ala Pro Gly Glu Arg Pro Ser Gly Met Phe
260 265 270

Asp Ser Ser Val Leu Cys Glu Cys Tyr Asp Ala Gly Cys Ala Trp Tyr
275 280 285

Glu Leu Thr Pro Ala Glu Thr Thr Val Arg Leu Arg Ala Tyr Met Asn
290 295 300

Thr Pro Gly Leu Pro Val Cys Gln Asp His Leu Glu Phe Trp Glu Gly
305 310 315 320

Val Phe Thr Gly Leu Thr His Ile Asp Ala His Phe Leu Ser Gln Thr
325 330 335

Lys Gln Ser Gly Glu Asn Leu Pro Tyr Leu Val Ala Tyr Gln Ala Thr
340 345 350

Val Cys Ala Arg Ala Gln Ala Pro Pro Pro Ser Trp Asp Gln Met Trp
355 360 365

Lys Cys Leu Ile Arg Leu Lys Pro Thr Leu His Gly Pro Thr Pro Leu
370 375 380

Leu Tyr Arg Leu Gly Ala Val Gln Asn Glu Ile Thr Leu Thr His Pro
385 390 395 400

Val Thr Lys Tyr Ile Met Thr Cys Met Ser Ala Asp Leu Glu Val Val
405 410 415

Thr Ser Thr Trp Val Leu Val Gly Gly Val Leu Ala Ala Leu Ala Ala
420 425 430

Tyr Cys Leu Ser Thr Gly Cys Val Val Ile Val Gly Arg Val Val Leu
435 440 445

Ser Gly Lys Pro Ala Ile Ile Pro Asp Arg Glu Val Leu Tyr Arg Glu
450 455 460

Phe Asp Glu Met Glu Glu Cys Ser Gln His Leu Pro Tyr Ile Glu Gln
465 470 475 480

Gly Met Met Leu Ala Glu Gln Phe Lys Gln Lys Ala Leu Gly Leu Leu
485 490 495

Gln Thr Ala Ser Arg Gln Ala Glu Val Ile Ala Pro Ala Val Gln Thr
500 505 510

Asn Trp Gln Lys Leu Glu Thr Phe Trp Ala Lys His Met Trp Asn Phe
515 520 525

Ile Ser Gly Ile Gln Tyr Leu Ala Gly Leu Ser Thr Leu Pro Gly Asn
530 535 540

Pro Ala Ile Ala Ser Leu Met Ala Phe Thr Ala Ala Val Thr Ser Pro
545 550 555 560

Leu Thr Thr Ser Gln Thr Leu Leu Phe Asn Ile Leu Gly Gly Trp Val
565 570 575

Ala Ala Gln Leu Ala Ala Pro Gly Ala Ala Thr Ala Phe Val Gly Ala
580 585 590

Gly Leu Ala Gly Ala Ala Ile Gly Ser Val Gly Leu Gly Lys Val Leu
595 600 605

Ile Asp Ile Leu Ala Gly Tyr Gly Ala Gly Val Ala Gly Ala Leu Val
610 615 620

Ala Phe Lys Ile Met Ser Gly Glu Val Pro Ser Thr Glu Asp Leu Val
625 630 635 640

Asn Leu Leu Pro Ala Ile Leu Ser Pro Gly Ala Leu Val Val Gly Val
645 650 655

Val Cys Ala Ala Ile Leu Arg Arg His Val Gly Pro Gly Glu Gly Ala
660 665 670

Val Gln Trp Met Asn Arg Leu Ile Ala Phe Ala Ser Arg Gly Asn His
675 680 685

Val Ser Pro Thr His Tyr Val Pro Glu Ser Asp Ala Ala Ala Arg Val
690 695 700

Thr Ala Ile Leu Ser Ser Leu Thr Val Thr Gln Leu Leu Arg Arg Leu
705 710 715 720

His Gln Trp Ile Ser Ser Glu Cys Thr Thr Pro Cys Ser Gly Ser Trp
725 730 735

Leu Arg Asp Ile Trp Asp Trp Ile Cys Glu Val Leu Ser Asp Phe Lys
740 745 750

Thr Trp Leu Lys Ala Lys Leu Met Pro Gln Leu Pro Gly Ile Pro Phe
755 760 765

Val Ser Cys Gln Arg Gly Tyr Lys Gly Val Trp Arg Gly Asp Gly Ile
770 775 780

Met His Thr Arg Cys His Cys Gly Ala Glu Ile Thr Gly His Val Lys
785 790 795 800

Asn Gly Thr Met Arg Ile Val Gly Pro Arg Thr Cys Arg Asn Met Trp
805 810 815

Ser Gly Thr Phe Pro Ile Asn Ala Tyr Thr Thr Gly Pro Cys Thr Pro
820 825 830

Leu Pro Ala Pro Asn Tyr Thr Phe Ala Leu Trp Arg Val Ser Ala Glu
835 840 845

Glu Tyr Val Glu Ile Arg Gln Val Gly Asp Phe His Tyr Val Thr Gly
850 855 860

Met Thr Thr Asp Asn Leu Lys Cys Pro Cys Gln Val Pro Ser Pro Glu
865 870 875 880

Phe Phe Thr Glu Leu Asp Gly Val Arg Leu His Arg Phe Ala Pro Pro
885 890 895

Cys Lys Pro Leu Leu Arg Glu Glu Val Ser Phe Arg Val Gly Leu His
900 905 910

Glu Tyr Pro Val Gly Ser Gln Leu Pro Cys Glu Pro Glu Pro Asp Val
915 920 925

Ala Val Leu Thr Ser Met Leu Thr Asp Pro Ser His Ile Thr Ala Glu
930 935 940

Ala Ala Gly Arg Arg Leu Ala Arg Gly Ser Pro Pro Ser Val Ala Ser
945 950 955 960

Ser Ser Ala Ser Gln Leu Ser Ala Pro Ser Leu Lys Ala Thr Cys Thr
965 970 975

Ala Asn His Asp Ser Pro Asp Ala Glu Leu Ile Glu Ala Asn Leu Leu
980 985 990

Trp Arg Gln Glu Met Gly Gly Asn Ile Thr Arg Val Glu Ser Glu Asn
995 1000 1005

Lys Val Val Ile Leu Asp Ser Phe Asp Pro Leu Val Ala Glu Glu Asp
1010 1015 1020

Glu Arg Glu Ile Ser Val Pro Ala Glu Ile Leu Arg Lys Ser Arg Arg
1025 1030 1035 1040

Phe Ala Gln Ala Leu Pro Val Trp Ala Arg Pro Asp Tyr Asn Pro Pro
1045 1050 1055

Leu Val Glu Thr Trp Lys Lys Pro Asp Tyr Glu Pro Pro Val Val His
1060 1065 1070

Gly Cys Pro Leu Pro Pro Pro Lys Ser Pro Pro Val Pro Pro Pro Arg
1075 1080 1085

Lys Lys Arg Thr Val Val Leu Thr Glu Ser Thr Leu Ser Thr Ala Leu
1090 1095 1100

Ala Glu Leu Ala Thr Arg Ser Phe Gly Ser Ser Ser Thr Ser Gly Ile
1105 1110 1115 1120

Thr Gly Asp Asn Thr Thr Thr Ser Ser Glu Pro Ala Pro Ser Gly Cys
1125 1130 1135

Pro Pro Asp Ser Asp Ala Glu Ser Tyr Ser Ser Met Pro Pro Leu Glu
1140 1145 1150

Gly Glu Pro Gly Asp Pro Asp Leu Ser Asp Gly Ser Trp Ser Thr Val
1155 1160 1165

Ser Ser Glu Ala Asn Ala Glu Asp Val Val Cys Cys Ser Met Ser Tyr
1170 1175 1180

Ser Trp Thr Gly Ala Leu Val Thr Pro Cys Ala Ala Glu Glu Gln Lys
1185 1190 1195 1200

Leu Pro Ile Asn Ala Leu Ser Asn Ser Leu Leu Arg His His Asn Leu
1205 1210 1215

Val Tyr Ser Thr Thr Ser Arg Ser Ala Cys Gln Arg Gln Lys Lys Val
1220 1225 1230

Thr Phe Asp Arg Leu Gln Val Leu Asp Ser His Tyr Gln Asp Val Leu
1235 1240 1245

Lys Glu Val Lys Ala Ala Ala Ser Lys Val Lys Ala Asn Leu Leu Ser
1250 1255 1260

Val Glu Glu Ala Cys Ser Leu Thr Pro Pro His Ser Ala Lys Ser Lys
1265 1270 1275 1280

Phe Gly Tyr Gly Ala Lys Asp Val Arg Cys His Ala Arg Lys Ala Val
1285 1290 1295

Thr His Ile Asn Ser Val Trp Lys Asp Leu Leu Glu Asp Asn Val Thr
1300 1305 1310

Pro Ile Asp Thr Thr Ile Met Ala Lys Asn Glu Val Phe Cys Val Gln
1315 1320 1325

Pro Glu Lys Gly Gly Arg Lys Pro Ala Arg Leu Ile Val Phe Pro Asp
1330 1335 1340

Leu Gly Val Arg Val Cys Glu Lys Met Ala Leu Tyr Asp Val Val Thr
1345 1350 1355 1360

Lys Leu Pro Leu Ala Val Met Gly Ser Ser Tyr Gly Phe Gln Tyr Ser
1365 1370 1375

Pro Gly Gln Arg Val Glu Phe Leu Val Gln Ala Trp Lys Ser Lys Lys
1380 1385 1390

Thr Pro Met Gly Phe Ser Tyr Asp Thr Arg Cys Phe Asp Ser Thr Val
1395 1400 1405

Thr Glu Ser Asp Ile Arg Thr Glu Glu Ala Ile Tyr Gln Cys Cys Asp
1410 1415 1420

Leu Asp Pro Gln Ala Arg Val Ala Ile Lys Ser Leu Thr Glu Arg Leu
1425 1430 1435 1440

Tyr Val Gly Gly Pro Leu Thr Asn Ser Arg Gly Glu Asn Cys Gly Tyr
1445 1450 1455

Arg Arg Cys Arg Ala Ser Gly Val Leu Thr Thr Ser Cys Gly Asn Thr
1460 1465 1470

Leu Thr Cys Tyr Ile Lys Ala Arg Ala Ala Cys Arg Ala Ala Gly Leu
1475 1480 1485

Gln Asp Cys Thr Met Leu Val Cys Gly Asp Asp Leu Val Val Ile Cys
1490 1495 1500

Glu Ser Ala Gly Val Gln Glu Asp Ala Ala Ser Leu Arg Ala Phe Thr
1505 1510 1515 1520

Glu Ala Met Thr Arg Tyr Ser Ala Pro Pro Gly Asp Pro Pro Gln Pro
1525 1530 1535

Glu Tyr Asp Leu Glu Leu Ile Thr Ser Cys Ser Ser Asn Val Ser Val
1540 1545 1550

Ala His Asp Gly Ala Gly Lys Arg Val Tyr Tyr Leu Thr Arg Asp Pro
1555 1560 1565

Thr Thr Pro Leu Ala Arg Ala Ala Trp Glu Thr Ala Arg His Thr Pro
1570 1575 1580

Val Asn Ser Trp Leu Gly Asn Ile Ile Met Phe Ala Pro Thr Leu Trp
1585 1590 1595 1600

Ala Arg Met Ile Leu Met Thr His Phe Phe Ser Val Leu Ile Ala Arg
1605 1610 1615

Asp Gln Leu Glu Gln Ala Leu Asp Cys Glu Ile Tyr Gly Ala Cys Tyr
1620 1625 1630

Ser Ile Glu Pro Leu Asp Leu Pro Pro Ile Ile Gln Arg Leu His Gly
1635 1640 1645

Leu Ser Ala Phe Ser Leu His Ser Tyr Ser Pro Gly Glu Ile Asn Arg
1650 1655 1660

Val Ala Ala Cys Leu Arg Lys Leu Gly Val Pro Pro Leu Arg Ala Trp
1665 1670 1675 1680

Arg His Arg Ala Arg Ser Val Arg Ala Arg Leu Leu Ala Arg Gly Gly
1685 1690 1695

Arg Ala Ala Ile Cys Gly Lys Tyr Leu Phe Asn Trp Ala Val Arg Thr
1700 1705 1710

Lys Leu Lys Leu Thr Pro Ile Ala Ala Ala Gly Gln Leu Asp Leu Ser
1715 1720 1725

Gly Trp Phe Thr Ala Gly Tyr Ser Gly Gly Asp Ile Tyr His Ser Val
1730 1735 1740

Ser His Ala Arg Pro Arg Trp Ile Trp Phe Cys Leu Leu Leu Leu Ala
1745 1750 1755 1760

Ala Gly Val Gly Ile Tyr Leu Leu Pro Asn Arg
1765 1770



<210> 3 

<211> 9620 

<212> DNA 

<213> Artificial Sequence 

<220> 

<221> CDS 

<222> (1990)..(7302)
<220> 

<223> Description of Artificial Sequence: pDeltaNS3NS5 



<400> 3 
cgcgcgtttc ggtgatgacg gtgaaaacct ctgacacatg cagctcccgg agacggtcac 60

agcttgtctg taagcggatg ccgggagcag acaagcccgt cagggcgcgt cagcgggtgt 120

tggcgggtgt cggggctggc ttaactatgc ggcatcagag cagattgtac tgagagtgca 180

ccatatgaag ctttttgcaa aagcctaggc ctccaaaaaa gcctcctcac tacttctgga 240

atagctcaga ggccgaggcg gcctcggcct ctgcataaat aaaaaaaatt agtcagccat 300

ggggcggaga atgggcggaa ctgggcgggg agggaattat tggctattgg ccattgcata 360

cgttgtatct atatcataat atgtacattt atattggctc atgtccaata tgaccgccat 420

gttgacattg attattgact agttattaat agtaatcaat tacggggtca ttagttcata 480

gcccatatat ggagttccgc gttacataac ttacggtaaa tggcccgcct ggctgaccgc 540

ccaacgaccc ccgcccattg acgtcaataa tgacgtatgt tcccatagta acgccaatag 600

ggactttcca ttgacgtcaa tgggtggagt atttacggta aactgcccac ttggcagtac 660

atcaagtgta tcatatgcca agtccgcccc ctattgacgt caatgacggt aaatggcccg 720

cctggcatta tgcccagtac atgaccttac gggactttcc tacttggcag tacatctacg 780

tattagtcat cgctattacc atggtgatgc ggttttggca gtacaccaat gggcgtggat 840

agcggtttga ctcacgggga tttccaagtc tccaccccat tgacgtcaat gggagtttgt 900

tttggcacca aaatcaacgg gactttccaa aatgtcgtaa taaccccgcc ccgttgacgc 960

aaatgggcgg taggcgtgta cggtgggagg tctatataag cagagctcgt ttagtgaacc 1020

gtcagatcgc ctggagacgc catccacgct gttttgacct ccatagaaga caccgggacc 1080

gatccagcct ccgcggccgg gaacggtgca ttggaacgcg gattccccgt gccaagagtg 1140

acgtaagtac cgcctataga ctctataggc acaccccttt ggctcttatg catgctatac 1200

tgtttttggc ttggggccta tacacccccg ctccttatgc tataggtgat ggtatagctt 1260

agcctatagg tgtgggttat tgaccattat tgaccactcc cctattggtg acgatacttt 1320

ccattactaa tccataacat ggctctttgc cacaactatc tctattggct atatgccaat 1380 

actctgtcct tcagagactg acacggactc tgtattttta caggatgggg tccatttatt 1440 

atttacaaat tcacatatac aacaacgccg tcccccgtgc ccgcagtttt tattaaacat 1500 

agcgtgggat ctccgacatc tcgggtacgt gttccggaca tgggctcttc tccggtagcg 1560 

gcggagcttc cacatccgag ccctggtccc atccgtccag cggctcatgg tcgctcggca 1620 

gctccttgct cctaacagtg gaggccagac ttaggcacag cacaatgccc accaccacca 1680 

gtgtgccgca caaggccgtg gcggtagggt atgtgtctga aaatgagctc ggagattggg 1740 

ctcgcacctg gacgcagatg gaagacttaa ggcagcggca gaagaagatg caggcagctg 1800 

agttgttgta ttctgataag agtcagaggt aactcccgtt gcggtgctgt taacggtgga 1860 

gggcagtgta gtctgagcag tactcgttgc tgccgcgcgc gccaccagac ataatagctg 1920 

acagactaac agactgttcc tttccatggg tcttttctgc agtcaccgtc gtcgacctaa 1980 

gaattcacc atg gct gca tat gca gct cag ggc tat aag gtg cta gta ctc 2031
Met Ala Ala Tyr Ala Ala Gln Gly Tyr Lys Val Leu Val Leu
1 5 10

aac ccc tct gtt gct gca aca ctg ggc ttt ggt gct tac atg tcc aag 2079
Asn Pro Ser Val Ala Ala Thr Leu Gly Phe Gly Ala Tyr Met Ser Lys
15 20 25 30

gct cat ggg atc gat cct aac atc agg acc ggg gtg aga aca att acc 2127
Ala His Gly Ile Asp Pro Asn Ile Arg Thr Gly Val Arg Thr Ile Thr
35 40 45

act ggc agc ccc atc acg tac tcc acc tac ggc aag ttc ctt gcc gac 2175
Thr Gly Ser Pro Ile Thr Tyr Ser Thr Tyr Gly Lys Phe Leu Ala Asp
50 55 60

ggc ggg tgc tcg ggg ggc gct tat gac ata ata att tgt gac gag tgc 2223
Gly Gly Cys Ser Gly Gly Ala Tyr Asp Ile Ile Ile Cys Asp Glu Cys
65 70 75

cac tcc acg gat gcc aca tcc atc ttg ggc att ggc act gtc ctt gac 2271
His Ser Thr Asp Ala Thr Ser Ile Leu Gly Ile Gly Thr Val Leu Asp
80 85 90

caa gca gag act gcg ggg gcg aga ctg gtt gtg ctc gcc acc gcc acc 2319
Gln Ala Glu Thr Ala Gly Ala Arg Leu Val Val Leu Ala Thr Ala Thr
95 100 105 110

cct ccg ggc tcc gtc act gtg ccc cat ccc aac atc gag gag gtt gct 2367
Pro Pro Gly Ser Val Thr Val Pro His Pro Asn Ile Glu Glu Val Ala
115 120 125

ctg tcc acc acc gga gag atc cct ttt tac ggc aag gct atc ccc ctc 2415
Leu Ser Thr Thr Gly Glu Ile Pro Phe Tyr Gly Lys Ala Ile Pro Leu
130 135 140

gaa gta atc aag ggg ggg aga cat ctc atc ttc tgt cat tca aag aag 2463
Glu Val Ile Lys Gly Gly Arg His Leu Ile Phe Cys His Ser Lys Lys
145 150 155 

aag tgc gac gaa ctc gcc gca aag ctg gtc gca ttg ggc atc aat gcc 2511
Lys Cys Asp Glu Leu Ala Ala Lys Leu Val Ala Leu Gly Ile Asn Ala
160 165 170

gtg gcc tac tac cgc ggt ctt gac gtg tcc gtc atc ccg acc agc ggc 2559
Val Ala Tyr Tyr Arg Gly Leu Asp Val Ser Val Ile Pro Thr Ser Gly
175 180 185 190

gat gtt gtc gtc gtg gca acc gat gcc ctc atg acc ggc tat acc ggc 2607
Asp Val Val Val Val Ala Thr Asp Ala Leu Met Thr Gly Tyr Thr Gly
195 200 205

gac ttc gac tcg gtg ata gac tgc aat acg tgt gtc acc cag aca gtc 2655
Asp Phe Asp Ser Val Ile Asp Cys Asn Thr Cys Val Thr Gln Thr Val
210 215 220

gat ttc agc ctt gac cct acc ttc acc att gag aca atc acg ctc ccc 2703
Asp Phe Ser Leu Asp Pro Thr Phe Thr Ile Glu Thr Ile Thr Leu Pro
225 230 235

caa gat gct gtc tcc cgc act caa cgt cgg ggc agg act ggc agg ggg 2751
Gln Asp Ala Val Ser Arg Thr Gln Arg Arg Gly Arg Thr Gly Arg Gly
240 245 250

aag cca ggc atc tac aga ttt gtg gca ccg ggg gag cgc ccc tcc ggc 2799
Lys Pro Gly Ile Tyr Arg Phe Val Ala Pro Gly Glu Arg Pro Ser Gly
255 260 265 270

atg ttc gac tcg tcc gtc ctc tgt gag tgc tat gac gca ggc tgt gct 2847
Met Phe Asp Ser Ser Val Leu Cys Glu Cys Tyr Asp Ala Gly Cys Ala
275 280 285

tgg tat gag ctc acg ccc gcc gag act aca gtt agg cta cga gcg tac 2895
Trp Tyr Glu Leu Thr Pro Ala Glu Thr Thr Val Arg Leu Arg Ala Tyr
290 295 300

atg aac acc ccg ggg ctt ccc gtg tgc cag gac cat ctt gaa ttt tgg 2943
Met Asn Thr Pro Gly Leu Pro Val Cys Gln Asp His Leu Glu Phe Trp
305 310 315

gag ggc gtc ttt aca ggc ctc act cat ata gat gcc cac ttt cta tcc 2991
Glu Gly Val Phe Thr Gly Leu Thr His Ile Asp Ala His Phe Leu Ser
320 325 330

cag aca aag cag agt ggg gag aac ctt cct tac ctg gta gcg tac caa 3039
Gln Thr Lys Gln Ser Gly Glu Asn Leu Pro Tyr Leu Val Ala Tyr Gln
335 340 345 350

gcc acc gtg tgc gct agg gct caa gcc cct ccc cca tcg tgg gac cag 3087
Ala Thr Val Cys Ala Arg Ala Gln Ala Pro Pro Pro Ser Trp Asp Gln
355 360 365

atg tgg aag tgt ttg att cgc ctc aag ccc acc ctc cat ggg cca aca 3135
Met Trp Lys Cys Leu Ile Arg Leu Lys Pro Thr Leu His Gly Pro Thr
370 375 380

ccc ctg cta tac aga ctg ggc gct gtt cag aat gaa atc acc ctg acg 3183
Pro Leu Leu Tyr Arg Leu Gly Ala Val Gln Asn Glu Ile Thr Leu Thr
385 390 395 

cac cca gtc acc aaa tac atc atg aca tgc atg tcg gcc gac ctg gag 3231
His Pro Val Thr Lys Tyr Ile Met Thr Cys Met Ser Ala Asp Leu Glu
400 405 410

gtc gtc acg agc acc tgg gtg ctc gtt ggc ggc gtc ctg gct gct ttg 3279
Val Val Thr Ser Thr Trp Val Leu Val Gly Gly Val Leu Ala Ala Leu
415 420 425 430

gcc gcg tat tgc ctg tca aca ggc tgc gtg gtc ata gtg ggc agg gtc 3327
Ala Ala Tyr Cys Leu Ser Thr Gly Cys Val Val Ile Val Gly Arg Val
435 440 445

gtc ttg tcc ggg aag ccg gca atc ata cct gac agg gaa gtc ctc tac 3375
Val Leu Ser Gly Lys Pro Ala Ile Ile Pro Asp Arg Glu Val Leu Tyr
450 455 460

cga gag ttc gat gag atg gaa gag tgc tct cag cac tta ccg tac atc 3423
Arg Glu Phe Asp Glu Met Glu Glu Cys Ser Gln His Leu Pro Tyr Ile
465 470 475

gag caa ggg atg atg ctc gcc gag cag ttc aag cag aag gcc ctc ggc 3471
Glu Gln Gly Met Met Leu Ala Glu Gln Phe Lys Gln Lys Ala Leu Gly
480 485 490

ctc ctg cag acc gcg tcc cgt cag gca gag gtt atc gcc cct gct gtc 3519
Leu Leu Gln Thr Ala Ser Arg Gln Ala Glu Val Ile Ala Pro Ala Val
495 500 505 510

cag acc aac tgg caa aaa ctc gag acc ttc tgg gcg aag cat atg tgg 3567
Gln Thr Asn Trp Gln Lys Leu Glu Thr Phe Trp Ala Lys His Met Trp
515 520 525

aac ttc atc agt ggg ata caa tac ttg gcg ggc ttg tca acg ctg cct 3615
Asn Phe Ile Ser Gly Ile Gln Tyr Leu Ala Gly Leu Ser Thr Leu Pro
530 535 540

ggt aac ccc gcc att gct tca ttg atg gct ttt aca gct gct gtc acc 3663
Gly Asn Pro Ala Ile Ala Ser Leu Met Ala Phe Thr Ala Ala Val Thr
545 550 555

agc cca cta acc act agc caa acc ctc ctc ttc aac ata ttg ggg ggg 3711
Ser Pro Leu Thr Thr Ser Gln Thr Leu Leu Phe Asn Ile Leu Gly Gly
560 565 570

tgg gtg gct gcc cag ctc gcc gcc ccc ggt gcc gct act gcc ttt gtg 3759
Trp Val Ala Ala Gln Leu Ala Ala Pro Gly Ala Ala Thr Ala Phe Val
575 580 585 590

ggc gct ggc tta gct ggc gcc gcc atc ggc agt gtt gga ctg ggg aag 3807
Gly Ala Gly Leu Ala Gly Ala Ala Ile Gly Ser Val Gly Leu Gly Lys
595 600 605

gtc ctc ata gac atc ctt gca ggg tat ggc gcg ggc gtg gcg gga gct 3855
Val Leu Ile Asp Ile Leu Ala Gly Tyr Gly Ala Gly Val Ala Gly Ala
610 615 620

ctt gtg gca ttc aag atc atg agc ggt gag gtc ccc tcc acg gag gac 3903
Leu Val Ala Phe Lys Ile Met Ser Gly Glu Val Pro Ser Thr Glu Asp
625 630 635 

ctg gtc aat cta ctg ccc gcc atc ctc tcg ccc gga gcc ctc gta gtc 3951
Leu Val Asn Leu Leu Pro Ala Ile Leu Ser Pro Gly Ala Leu Val Val
640 645 650

ggc gtg gtc tgt gca gca ata ctg cgc cgg cac gtt ggc ccg ggc gag 3999
Gly Val Val Cys Ala Ala Ile Leu Arg Arg His Val Gly Pro Gly Glu
655 660 665 670

ggg gca gtg cag tgg atg aac cgg ctg ata gcc ttc gcc tcc cgg ggg 4047
Gly Ala Val Gln Trp Met Asn Arg Leu Ile Ala Phe Ala Ser Arg Gly
675 680 685

aac cat gtt tcc ccc acg cac tac gtg ccg gag agc gat gca gct gcc 4095
Asn His Val Ser Pro Thr His Tyr Val Pro Glu Ser Asp Ala Ala Ala
690 695 700

cgc gtc act gcc ata ctc agc agc ctc act gta acc cag ctc ctg agg 4143
Arg Val Thr Ala Ile Leu Ser Ser Leu Thr Val Thr Gln Leu Leu Arg
705 710 715

cga ctg cac cag tgg ata agc tcg gag tgt acc act cca tgc tcc ggt 4191
Arg Leu His Gln Trp Ile Ser Ser Glu Cys Thr Thr Pro Cys Ser Gly
720 725 730

tcc tgg cta agg gac atc tgg gac tgg ata tgc gag gtg ttg agc gac 4239
Ser Trp Leu Arg Asp Ile Trp Asp Trp Ile Cys Glu Val Leu Ser Asp
735 740 745 750

ttt aag acc tgg cta aaa gct aag ctc atg cca cag ctg cct ggg atc 4287
Phe Lys Thr Trp Leu Lys Ala Lys Leu Met Pro Gln Leu Pro Gly Ile
755 760 765

ccc ttt gtg tcc tgc cag cgc ggg tat aag ggg gtc tgg cga ggg gac 4335
Pro Phe Val Ser Cys Gln Arg Gly Tyr Lys Gly Val Trp Arg Gly Asp
770 775 780

ggc atc atg cac act cgc tgc cac tgt gga gct gag atc act gga cat 4383
Gly Ile Met His Thr Arg Cys His Cys Gly Ala Glu Ile Thr Gly His
785 790 795

gtc aaa aac ggg acg atg agg atc gtc ggt cct agg acc tgc agg aac 4431
Val Lys Asn Gly Thr Met Arg Ile Val Gly Pro Arg Thr Cys Arg Asn
800 805 810

atg tgg agt ggg acc ttc ccc att aat gcc tac acc acg ggc ccc tgt 4479
Met Trp Ser Gly Thr Phe Pro Ile Asn Ala Tyr Thr Thr Gly Pro Cys
815 820 825 830

acc ccc ctt cct gcg ccg aac tac acg ttc gcg cta tgg agg gtg tct 4527
Thr Pro Leu Pro Ala Pro Asn Tyr Thr Phe Ala Leu Trp Arg Val Ser
835 840 845

gca gag gaa tac gtg gag ata agg cag gtg ggg gac ttc cac tac gtg 4575
Ala Glu Glu Tyr Val Glu Ile Arg Gln Val Gly Asp Phe His Tyr Val
850 855 860

acg ggt atg act act gac aat ctt aaa tgc ccg tgc cag gtc cca tcg 4623
Thr Gly Met Thr Thr Asp Asn Leu Lys Cys Pro Cys Gln Val Pro Ser
865 870 875 

ccc gaa ttt ttc aca gaa ttg gac ggg gtg cgc cta cat agg ttt gcg 4671
Pro Glu Phe Phe Thr Glu Leu Asp Gly Val Arg Leu His Arg Phe Ala
880 885 890

ccc ccc tgc aag ccc ttg ctg cgg gag gag gta tca ttc aga gta gga 4719
Pro Pro Cys Lys Pro Leu Leu Arg Glu Glu Val Ser Phe Arg Val Gly
895 900 905 910

ctc cac gaa tac ccg gta ggg tcg caa tta cct tgc gag ccc gaa ccg 4767
Leu His Glu Tyr Pro Val Gly Ser Gln Leu Pro Cys Glu Pro Glu Pro
915 920 925

gac gtg gcc gtg ttg acg tcc atg ctc act gat ccc tcc cat ata aca 4815
Asp Val Ala Val Leu Thr Ser Met Leu Thr Asp Pro Ser His Ile Thr
930 935 940

gca gag gcg gcc ggg cga agg ttg gcg agg gga tca ccc ccc tct gtg 4863
Ala Glu Ala Ala Gly Arg Arg Leu Ala Arg Gly Ser Pro Pro Ser Val
945 950 955

gcc agc tcc tcg gct agc cag cta tcc gct cca tct ctc aag gca act 4911
Ala Ser Ser Ser Ala Ser Gln Leu Ser Ala Pro Ser Leu Lys Ala Thr
960 965 970

tgc acc gct aac cat gac tcc cct gat gct gag ctc ata gag gcc aac 4959
Cys Thr Ala Asn His Asp Ser Pro Asp Ala Glu Leu Ile Glu Ala Asn
975 980 985 990

ctc cta tgg agg cag gag atg ggc ggc aac atc acc agg gtt gag tca 5007
Leu Leu Trp Arg Gln Glu Met Gly Gly Asn Ile Thr Arg Val Glu Ser
995 1000 1005

gaa aac aaa gtg gtg att ctg gac tcc ttc gat ccg ctt gtg gcg gag 5055
Glu Asn Lys Val Val Ile Leu Asp Ser Phe Asp Pro Leu Val Ala Glu
1010 1015 1020

gag gac gag cgg gag atc tcc gta ccc gca gaa atc ctg cgg aag tct 5103
Glu Asp Glu Arg Glu Ile Ser Val Pro Ala Glu Ile Leu Arg Lys Ser
1025 1030 1035

cgg aga ttc gcc cag gcc ctg ccc gtt tgg gcg cgg ccg gac tat aac 5151
Arg Arg Phe Ala Gln Ala Leu Pro Val Trp Ala Arg Pro Asp Tyr Asn
1040 1045 1050

ccc ccg cta gtg gag acg tgg aaa aag ccc gac tac gaa cca cct gtg 5199
Pro Pro Leu Val Glu Thr Trp Lys Lys Pro Asp Tyr Glu Pro Pro Val
1055 1060 1065 1070

gtc cat ggc tgc ccg ctt cca cct cca aag tcc cct cct gtg cct ccg 5247
Val His Gly Cys Pro Leu Pro Pro Pro Lys Ser Pro Pro Val Pro Pro
1075 1080 1085

cct cgg aag aag cgg acg gtg gtc ctc act gaa tca acc cta tct act 5295
Pro Arg Lys Lys Arg Thr Val Val Leu Thr Glu Ser Thr Leu Ser Thr
1090 1095 1100

gcc ttg gcc gag ctc gcc acc aga agc ttt ggc agc tcc tca act tcc 5343
Ala Leu Ala Glu Leu Ala Thr Arg Ser Phe Gly Ser Ser Ser Thr Ser
1105 1110 1115 

ggc att acg ggc gac aat acg aca aca tcc tct gag ccc gcc cct tct 5391
Gly Ile Thr Gly Asp Asn Thr Thr Thr Ser Ser Glu Pro Ala Pro Ser
1120 1125 1130

ggc tgc ccc ccc gac tcc gac gct gag tcc tat tcc tcc atg ccc ccc 5439
Gly Cys Pro Pro Asp Ser Asp Ala Glu Ser Tyr Ser Ser Met Pro Pro
1135 1140 1145 1150

ctg gag ggg gag cct ggg gat ccg gat ctt agc gac ggg tca tgg tca 5487
Leu Glu Gly Glu Pro Gly Asp Pro Asp Leu Ser Asp Gly Ser Trp Ser
1155 1160 1165

acg gtc agt agt gag gcc aac gcg gag gat gtc gtg tgc tgc tca atg 5535
Thr Val Ser Ser Glu Ala Asn Ala Glu Asp Val Val Cys Cys Ser Met
1170 1175 1180

tct tac tct tgg aca ggc gca ctc gtc acc ccg tgc gcc gcg gaa gaa 5583
Ser Tyr Ser Trp Thr Gly Ala Leu Val Thr Pro Cys Ala Ala Glu Glu
1185 1190 1195

cag aaa ctg ccc atc aat gca cta agc aac tcg ttg cta cgt cac cac 5631
Gln Lys Leu Pro Ile Asn Ala Leu Ser Asn Ser Leu Leu Arg His His
1200 1205 1210

aat ttg gtg tat tcc acc acc tca cgc agt gct tgc caa agg cag aag 5679
Asn Leu Val Tyr Ser Thr Thr Ser Arg Ser Ala Cys Gln Arg Gln Lys
1215 1220 1225 1230

aaa gtc aca ttt gac aga ctg caa gtt ctg gac agc cat tac cag gac 5727
Lys Val Thr Phe Asp Arg Leu Gln Val Leu Asp Ser His Tyr Gln Asp
1235 1240 1245

gta ctc aag gag gtt aaa gca gcg gcg tca aaa gtg aag gct aac ttg 5775
Val Leu Lys Glu Val Lys Ala Ala Ala Ser Lys Val Lys Ala Asn Leu
1250 1255 1260

cta tcc gta gag gaa gct tgc agc ctg acg ccc cca cac tca gcc aaa 5823
Leu Ser Val Glu Glu Ala Cys Ser Leu Thr Pro Pro His Ser Ala Lys
1265 1270 1275

tcc aag ttt ggt tat ggg gca aaa gac gtc cgt tgc cat gcc aga aag 5871
Ser Lys Phe Gly Tyr Gly Ala Lys Asp Val Arg Cys His Ala Arg Lys
1280 1285 1290

gcc gta acc cac atc aac tcc gtg tgg aaa gac ctt ctg gaa gac aat 5919
Ala Val Thr His Ile Asn Ser Val Trp Lys Asp Leu Leu Glu Asp Asn
1295 1300 1305 1310

gta aca cca ata gac act acc atc atg gct aag aac gag gtt ttc tgc 5967
Val Thr Pro Ile Asp Thr Thr Ile Met Ala Lys Asn Glu Val Phe Cys
1315 1320 1325

gtt cag cct gag aag ggg ggt cgt aag cca gct cgt ctc atc gtg ttc 6015
Val Gln Pro Glu Lys Gly Gly Arg Lys Pro Ala Arg Leu Ile Val Phe
1330 1335 1340

ccc gat ctg ggc gtg cgc gtg tgc gaa aag atg gct ttg tac gac gtg 6063
Pro Asp Leu Gly Val Arg Val Cys Glu Lys Met Ala Leu Tyr Asp Val
1345 1350 1355 

gtt aca aag ctc ccc ttg gcc gtg atg gga agc tcc tac gga ttc caa 6111
Val Thr Lys Leu Pro Leu Ala Val Met Gly Ser Ser Tyr Gly Phe Gln
1360 1365 1370

tac tca cca gga cag cgg gtt gaa ttc ctc gtg caa gcg tgg aag tcc 6159
Tyr Ser Pro Gly Gln Arg Val Glu Phe Leu Val Gln Ala Trp Lys Ser
1375 1380 1385 1390

aag aaa acc cca atg ggg ttc tcg tat gat acc cgc tgc ttt gac tcc 6207
Lys Lys Thr Pro Met Gly Phe Ser Tyr Asp Thr Arg Cys Phe Asp Ser
1395 1400 1405

aca gtc act gag agc gac atc cgt acg gag gag gca atc tac caa tgt 6255
Thr Val Thr Glu Ser Asp Ile Arg Thr Glu Glu Ala Ile Tyr Gln Cys
1410 1415 1420

tgt gac ctc gac ccc caa gcc cgc gtg gcc atc aag tcc ctc acc gag 6303
Cys Asp Leu Asp Pro Gln Ala Arg Val Ala Ile Lys Ser Leu Thr Glu
1425 1430 1435

agg ctt tat gtt ggg ggc cct ctt acc aat tca agg ggg gag aac tgc 6351
Arg Leu Tyr Val Gly Gly Pro Leu Thr Asn Ser Arg Gly Glu Asn Cys
1440 1445 1450

ggc tat cgc agg tgc cgc gcg agc ggc gta ctg aca act agc tgt ggt 6399
Gly Tyr Arg Arg Cys Arg Ala Ser Gly Val Leu Thr Thr Ser Cys Gly
1455 1460 1465 1470

aac acc ctc act tgc tac atc aag gcc cgg gca gcc tgt cga gcc gca 6447
Asn Thr Leu Thr Cys Tyr Ile Lys Ala Arg Ala Ala Cys Arg Ala Ala
1475 1480 1485

ggg ctc cag gac tgc acc atg ctc gtg tgt ggc gac gac tta gtc gtt 6495
Gly Leu Gln Asp Cys Thr Met Leu Val Cys Gly Asp Asp Leu Val Val
1490 1495 1500

atc tgt gaa agc gcg ggg gtc cag gag gac gcg gcg agc ctg aga gcc 6543
Ile Cys Glu Ser Ala Gly Val Gln Glu Asp Ala Ala Ser Leu Arg Ala
1505 1510 1515

ttc acg gag gct atg acc agg tac tcc gcc ccc cct ggg gac ccc cca 6591
Phe Thr Glu Ala Met Thr Arg Tyr Ser Ala Pro Pro Gly Asp Pro Pro
1520 1525 1530

caa cca gaa tac gac ttg gag ctc ata aca tca tgc tcc tcc aac gtg 6639
Gln Pro Glu Tyr Asp Leu Glu Leu Ile Thr Ser Cys Ser Ser Asn Val
1535 1540 1545 1550

tca gtc gcc cac gac ggc gct gga aag agg gtc tac tac ctc acc cgt 6687
Ser Val Ala His Asp Gly Ala Gly Lys Arg Val Tyr Tyr Leu Thr Arg
1555 1560 1565

gac cct aca acc ccc ctc gcg aga gct gcg tgg gag aca gca aga cac 6735
Asp Pro Thr Thr Pro Leu Ala Arg Ala Ala Trp Glu Thr Ala Arg His
1570 1575 1580

act cca gtc aat tcc tgg cta ggc aac ata atc atg ttt gcc ccc aca 6783
Thr Pro Val Asn Ser Trp Leu Gly Asn Ile Ile Met Phe Ala Pro Thr
1585 1590 1595 

ctg tgg gcg agg atg ata ctg atg acc cat ttc ttt agc gtc ctt ata 6831
Leu Trp Ala Arg Met Ile Leu Met Thr His Phe Phe Ser Val Leu Ile
1600 1605 1610

gcc agg gac cag ctt gaa cag gcc ctc gat tgc gag atc tac ggg gcc 6879
Ala Arg Asp Gln Leu Glu Gln Ala Leu Asp Cys Glu Ile Tyr Gly Ala
1615 1620 1625 1630

tgc tac tcc ata gaa cca ctg gat cta cct cca atc att caa aga ctc 6927
Cys Tyr Ser Ile Glu Pro Leu Asp Leu Pro Pro Ile Ile Gln Arg Leu
1635 1640 1645

cat ggc ctc agc gca ttt tca ctc cac agt tac tct cca ggt gaa atc 6975
His Gly Leu Ser Ala Phe Ser Leu His Ser Tyr Ser Pro Gly Glu Ile
1650 1655 1660

aat agg gtg gcc gca tgc ctc aga aaa ctt ggg gta ccg ccc ttg cga 7023
Asn Arg Val Ala Ala Cys Leu Arg Lys Leu Gly Val Pro Pro Leu Arg
1665 1670 1675

gct tgg aga cac cgg gcc cgg agc gtc cgc gct agg ctt ctg gcc aga 7071
Ala Trp Arg His Arg Ala Arg Ser Val Arg Ala Arg Leu Leu Ala Arg
1680 1685 1690

gga ggc agg gct gcc ata tgt ggc aag tac ctc ttc aac tgg gca gta 7119
Gly Gly Arg Ala Ala Ile Cys Gly Lys Tyr Leu Phe Asn Trp Ala Val
1695 1700 1705 1710

aga aca aag ctc aaa ctc act cca ata gcg gcc gct ggc cag ctg gac 7167
Arg Thr Lys Leu Lys Leu Thr Pro Ile Ala Ala Ala Gly Gln Leu Asp
1715 1720 1725

ttg tcc ggc tgg ttc acg gct ggc tac agc ggg gga gac att tat cac 7215
Leu Ser Gly Trp Phe Thr Ala Gly Tyr Ser Gly Gly Asp Ile Tyr His
1730 1735 1740

agc gtg tct cat gcc cgg ccc cgc tgg atc tgg ttt tgc cta ctc ctg 7263
Ser Val Ser His Ala Arg Pro Arg Trp Ile Trp Phe Cys Leu Leu Leu
1745 1750 1755

ctt gct gca ggg gta ggc atc tac ctc ctc ccc aac cga tgaaggttgg 7312
Leu Ala Ala Gly Val Gly Ile Tyr Leu Leu Pro Asn Arg
1760 1765 1770






ggtaaacact ccggcctaaa aaaaaaaaaa aatctagaaa ggcgcgccaa gatatcaagg 7372

atccactacg cgttagagct cgctgatcag cctcgactgt gccttctagt tgccagccat 7432

ctgttgtttg cccctccccc gtgccttcct tgaccctgga aggtgccact cccactgtcc 7492

tttcctaata aaatgaggaa attgcatcgc attgtctgag taggtgtcat tctattctgg 7552

ggggtggggt ggggcaggac agcaaggggg aggattggga agacaatagc aggcatgctg 7612

gggagctctt ccgcttcctc gctcactgac tcgctgcgct cggtcgttcg gctgcggcga 7672

gcggtatcag ctcactcaaa ggcggtaata cggttatcca cagaatcagg ggataacgca 7732


ggaaagaaca tgtgagcaaa aggccagcaa aaggccagga accgtaaaaa [illegible] 7792

ctggcgtttt tccataggct ccgcccccct gacgagcatc acaaaaatcg acactcaaat 7852

cagaggtggc gaaacccgac aggactataa agataccagg cgtttccccc taaaaactcc 7912

ctcgtgcgct ctcctgttcc gaccctgccg cttaccggat [illegible] [illegible] 7972

tcgggaagcg tggcgctttc tcaatgctca cgctgtaggt atctcagttc [illegible] 8032

gttcgctcca agctgggctg tgtgcacgaa ccccccgttc agcccgaccg ctgcgcctta 8092

tccggtaact atcgtcttga gtccaacccg gtaagacacg [illegible] [illegible] 8152

gccactggta acaggattag cagagcgagg tatgtaggcg gtgctacaga gttcttgaag 8212

tggtggccta actacggcta cactagaagg acagtatttg gtatctgcgc tctactaaaa 8272

ccagttacct tcggaaaaag agttggtagc tcttgatccg gcaaacaaac [illegible] 8332

agcggtggtt tttttgtttg caagcagcag attacgcgca gaaaaaaagg atctcaaaaa 8392

gatcctttga tcttttctac ggggtctgac gctcagtgga acgaaaactc [illegible] 8452

attttggtca tgagattatc aaaaaggatc ttcacctaga [illegible] [illegible] 8512

agttttaaat caatctaaag tatatatgag taaacttggt [illegible] ccaatactta 8572

atcagtgagg cacctatctc agcgatctgt ctatttcgtt [illegible] tacctaactc 8632

cccgtcgtgt agataactac gatacgggag ggcttaccat [illegible] [illegible] 8692

ataccgcgag acccacgctc accggctcca gatttatcag [illegible] [illegible] 8752

agggccgagc gcagaagtgg tcctgcaact ttatccgcct [illegible] tattaattat 8812

tgccgggaag ctagagtaag tagttcgcca gttaatagtt [illegible] tattaccatt 8872

gctacaggca tcgtggtgtc acgctcgtcg tttggtatgg [illegible] [illegible] 8932

caacgatcaa ggcgagttac atgatccccc atgttgtgca aaaaaacaat [illegible] 8992

ggtcctccga tcgttgtcag aagtaagttg gccgcagtgt [illegible] [illegible] 9052

gcactgcata attctcttac tgtcatgcca tccgtaagat gcttttctgt aactaataaa 9112


tactcaacca agtcattctg agaatagtgt atgcggcgac cgagttgctc ttgcccggcg 9172

tcaatacggg ataataccgc gccacatagc agaactttaa aagtgctcat cattggaaaa 9232

cgttcttcgg ggcgaaaact ctcaaggatc ttaccgctgt tgagatccag ttcgatgtaa 9292

cccactcgtg cacccaactg atcttcagca tcttttactt tcaccagcgt ttctgggtga 9352

gcaaaaacag gaaggcaaaa tgccgcaaaa aagggaataa gggcgacacg gaaatgttga 9412

atactcatac tcttcctttt tcaatattat tgaagcattt atcagggtta ttgtctcatg 9472

agcggataca tatttgaatg tatttagaaa aataaacaaa taggggttcc gcgcacattt 9532
ccccgaaaag tgccacctga cgtctaagaa accattatta tcatgacatt aacctataaa 9592
aataggcgta tcacgaggcc ctttcgtc 9620



<210> 4 
<211> 1771 
<212> PRT 

<213> Artificial Sequence 

<400> 4 

Met Ala Ala Tyr Ala Ala Gln Gly Tyr Lys Val Leu Val Leu Asn Pro
1 5 10 15

Ser Val Ala Ala Thr Leu Gly Phe Gly Ala Tyr Met Ser Lys Ala His
20 25 30

Gly Ile Asp Pro Asn Ile Arg Thr Gly Val Arg Thr Ile Thr Thr Gly
35 40 45

Ser Pro Ile Thr Tyr Ser Thr Tyr Gly Lys Phe Leu Ala Asp Gly Gly
50 55 60

Cys Ser Gly Gly Ala Tyr Asp Ile Ile Ile Cys Asp Glu Cys His Ser
65 70 75 80

Thr Asp Ala Thr Ser Ile Leu Gly Ile Gly Thr Val Leu Asp Gln Ala
85 90 95

Glu Thr Ala Gly Ala Arg Leu Val Val Leu Ala Thr Ala Thr Pro Pro
100 105 110

Gly Ser Val Thr Val Pro His Pro Asn Ile Glu Glu Val Ala Leu Ser
115 120 125

Thr Thr Gly Glu Ile Pro Phe Tyr Gly Lys Ala Ile Pro Leu Glu Val
130 135 140

Ile Lys Gly Gly Arg His Leu Ile Phe Cys His Ser Lys Lys Lys Cys
145 150 155 160

Asp Glu Leu Ala Ala Lys Leu Val Ala Leu Gly Ile Asn Ala Val Ala
165 170 175

Tyr Tyr Arg Gly Leu Asp Val Ser Val Ile Pro Thr Ser Gly Asp Val
180 185 190

Val Val Val Ala Thr Asp Ala Leu Met Thr Gly Tyr Thr Gly Asp Phe
195 200 205

Asp Ser Val Ile Asp Cys Asn Thr Cys Val Thr Gln Thr Val Asp Phe
210 215 220

Ser Leu Asp Pro Thr Phe Thr Ile Glu Thr Ile Thr Leu Pro Gln Asp
225 230 235 240

Ala Val Ser Arg Thr Gln Arg Arg Gly Arg Thr Gly Arg Gly Lys Pro
245 250 255

Gly Ile Tyr Arg Phe Val Ala Pro Gly Glu Arg Pro Ser Gly Met Phe
260 265 270 

Asp Ser Ser Val Leu Cys Glu Cys Tyr Asp Ala Gly Cys Ala Trp Tyr
275 280 285

Glu Leu Thr Pro Ala Glu Thr Thr Val Arg Leu Arg Ala Tyr Met Asn
290 295 300

Thr Pro Gly Leu Pro Val Cys Gln Asp His Leu Glu Phe Trp Glu Gly
305 310 315 320

Val Phe Thr Gly Leu Thr His Ile Asp Ala His Phe Leu Ser Gln Thr
325 330 335

Lys Gln Ser Gly Glu Asn Leu Pro Tyr Leu Val Ala Tyr Gln Ala Thr
340 345 350

Val Cys Ala Arg Ala Gln Ala Pro Pro Pro Ser Trp Asp Gln Met Trp
355 360 365

Lys Cys Leu Ile Arg Leu Lys Pro Thr Leu His Gly Pro Thr Pro Leu
370 375 380

Leu Tyr Arg Leu Gly Ala Val Gln Asn Glu Ile Thr Leu Thr His Pro
385 390 395 400

Val Thr Lys Tyr Ile Met Thr Cys Met Ser Ala Asp Leu Glu Val Val
405 410 415

Thr Ser Thr Trp Val Leu Val Gly Gly Val Leu Ala Ala Leu Ala Ala
420 425 430

Tyr Cys Leu Ser Thr Gly Cys Val Val Ile Val Gly Arg Val Val Leu
435 440 445

Ser Gly Lys Pro Ala Ile Ile Pro Asp Arg Glu Val Leu Tyr Arg Glu
450 455 460

Phe Asp Glu Met Glu Glu Cys Ser Gln His Leu Pro Tyr Ile Glu Gln
465 470 475 480

Gly Met Met Leu Ala Glu Gln Phe Lys Gln Lys Ala Leu Gly Leu Leu
485 490 495

Gln Thr Ala Ser Arg Gln Ala Glu Val Ile Ala Pro Ala Val Gln Thr
500 505 510

Asn Trp Gln Lys Leu Glu Thr Phe Trp Ala Lys His Met Trp Asn Phe
515 520 525

Ile Ser Gly Ile Gln Tyr Leu Ala Gly Leu Ser Thr Leu Pro Gly Asn
530 535 540

Pro Ala Ile Ala Ser Leu Met Ala Phe Thr Ala Ala Val Thr Ser Pro
545 550 555 560

Leu Thr Thr Ser Gln Thr Leu Leu Phe Asn Ile Leu Gly Gly Trp Val
565 570 575 

Ala Ala Gln Leu Ala Ala Pro Gly Ala Ala Thr Ala Phe Val Gly Ala
580 585 590

Gly Leu Ala Gly Ala Ala Ile Gly Ser Val Gly Leu Gly Lys Val Leu
595 600 605

Ile Asp Ile Leu Ala Gly Tyr Gly Ala Gly Val Ala Gly Ala Leu Val
610 615 620

Ala Phe Lys Ile Met Ser Gly Glu Val Pro Ser Thr Glu Asp Leu Val
625 630 635 640

Asn Leu Leu Pro Ala Ile Leu Ser Pro Gly Ala Leu Val Val Gly Val
645 650 655

Val Cys Ala Ala Ile Leu Arg Arg His Val Gly Pro Gly Glu Gly Ala
660 665 670

Val Gln Trp Met Asn Arg Leu Ile Ala Phe Ala Ser Arg Gly Asn His
675 680 685

Val Ser Pro Thr His Tyr Val Pro Glu Ser Asp Ala Ala Ala Arg Val
690 695 700

Thr Ala Ile Leu Ser Ser Leu Thr Val Thr Gln Leu Leu Arg Arg Leu
705 710 715 720

His Gln Trp Ile Ser Ser Glu Cys Thr Thr Pro Cys Ser Gly Ser Trp
725 730 735

Leu Arg Asp Ile Trp Asp Trp Ile Cys Glu Val Leu Ser Asp Phe Lys
740 745 750

Thr Trp Leu Lys Ala Lys Leu Met Pro Gln Leu Pro Gly Ile Pro Phe
755 760 765

Val Ser Cys Gln Arg Gly Tyr Lys Gly Val Trp Arg Gly Asp Gly Ile
770 775 780

Met His Thr Arg Cys His Cys Gly Ala Glu Ile Thr Gly His Val Lys
785 790 795 800

Asn Gly Thr Met Arg Ile Val Gly Pro Arg Thr Cys Arg Asn Met Trp
805 810 815

Ser Gly Thr Phe Pro Ile Asn Ala Tyr Thr Thr Gly Pro Cys Thr Pro
820 825 830

Leu Pro Ala Pro Asn Tyr Thr Phe Ala Leu Trp Arg Val Ser Ala Glu
835 840 845

Glu Tyr Val Glu Ile Arg Gln Val Gly Asp Phe His Tyr Val Thr Gly
850 855 860

Met Thr Thr Asp Asn Leu Lys Cys Pro Cys Gln Val Pro Ser Pro Glu
865 870 875 880

Phe Phe Thr Glu Leu Asp Gly Val Arg Leu His Arg Phe Ala Pro Pro
885 890 895

Cys Lys Pro Leu Leu Arg Glu Glu Val Ser Phe Arg Val Gly Leu His 
900 905 910 

Glu Tyr Pro Val Gly Ser Gln Leu Pro Cys Glu Pro Glu Pro Asp Val
915 920 925

Ala Val Leu Thr Ser Met Leu Thr Asp Pro Ser His Ile Thr Ala Glu
930 935 940

Ala Ala Gly Arg Arg Leu Ala Arg Gly Ser Pro Pro Ser Val Ala Ser
945 950 955 960

Ser Ser Ala Ser Gln Leu Ser Ala Pro Ser Leu Lys Ala Thr Cys Thr
965 970 975

Ala Asn His Asp Ser Pro Asp Ala Glu Leu Ile Glu Ala Asn Leu Leu
980 985 990

Trp Arg Gln Glu Met Gly Gly Asn Ile Thr Arg Val Glu Ser Glu Asn
995 1000 1005

Lys Val Val Ile Leu Asp Ser Phe Asp Pro Leu Val Ala Glu Glu Asp
1010 1015 1020

Glu Arg Glu Ile Ser Val Pro Ala Glu Ile Leu Arg Lys Ser Arg Arg
1025 1030 1035 1040

Phe Ala Gln Ala Leu Pro Val Trp Ala Arg Pro Asp Tyr Asn Pro Pro
1045 1050 1055

Leu Val Glu Thr Trp Lys Lys Pro Asp Tyr Glu Pro Pro Val Val His
1060 1065 1070

Gly Cys Pro Leu Pro Pro Pro Lys Ser Pro Pro Val Pro Pro Pro Arg
1075 1080 1085

Lys Lys Arg Thr Val Val Leu Thr Glu Ser Thr Leu Ser Thr Ala Leu
1090 1095 1100

Ala Glu Leu Ala Thr Arg Ser Phe Gly Ser Ser Ser Thr Ser Gly Ile
1105 1110 1115 1120

Thr Gly Asp Asn Thr Thr Thr Ser Ser Glu Pro Ala Pro Ser Gly Cys
1125 1130 1135

Pro Pro Asp Ser Asp Ala Glu Ser Tyr Ser Ser Met Pro Pro Leu Glu
1140 1145 1150

Gly Glu Pro Gly Asp Pro Asp Leu Ser Asp Gly Ser Trp Ser Thr Val
1155 1160 1165

Ser Ser Glu Ala Asn Ala Glu Asp Val Val Cys Cys Ser Met Ser Tyr
1170 1175 1180

Ser Trp Thr Gly Ala Leu Val Thr Pro Cys Ala Ala Glu Glu Gln Lys
1185 1190 1195 1200

Leu Pro Ile Asn Ala Leu Ser Asn Ser Leu Leu Arg His His Asn Leu
1205 1210 1215 

Val Tyr Ser Thr Thr Ser Arg Ser Ala Cys Gln Arg Gln Lys Lys Val
1220 1225 1230

Thr Phe Asp Arg Leu Gln Val Leu Asp Ser His Tyr Gln Asp Val Leu
1235 1240 1245

Lys Glu Val Lys Ala Ala Ala Ser Lys Val Lys Ala Asn Leu Leu Ser
1250 1255 1260

Val Glu Glu Ala Cys Ser Leu Thr Pro Pro His Ser Ala Lys Ser Lys
1265 1270 1275 1280

Phe Gly Tyr Gly Ala Lys Asp Val Arg Cys His Ala Arg Lys Ala Val
1285 1290 1295

Thr His Ile Asn Ser Val Trp Lys Asp Leu Leu Glu Asp Asn Val Thr
1300 1305 1310

Pro Ile Asp Thr Thr Ile Met Ala Lys Asn Glu Val Phe Cys Val Gln
1315 1320 1325

Pro Glu Lys Gly Gly Arg Lys Pro Ala Arg Leu Ile Val Phe Pro Asp
1330 1335 1340

Leu Gly Val Arg Val Cys Glu Lys Met Ala Leu Tyr Asp Val Val Thr
1345 1350 1355 1360

Lys Leu Pro Leu Ala Val Met Gly Ser Ser Tyr Gly Phe Gln Tyr Ser
1365 1370 1375

Pro Gly Gln Arg Val Glu Phe Leu Val Gln Ala Trp Lys Ser Lys Lys
1380 1385 1390

Thr Pro Met Gly Phe Ser Tyr Asp Thr Arg Cys Phe Asp Ser Thr Val
1395 1400 1405

Thr Glu Ser Asp Ile Arg Thr Glu Glu Ala Ile Tyr Gln Cys Cys Asp
1410 1415 1420

Leu Asp Pro Gln Ala Arg Val Ala Ile Lys Ser Leu Thr Glu Arg Leu
1425 1430 1435 1440

Tyr Val Gly Gly Pro Leu Thr Asn Ser Arg Gly Glu Asn Cys Gly Tyr
1445 1450 1455

Arg Arg Cys Arg Ala Ser Gly Val Leu Thr Thr Ser Cys Gly Asn Thr
1460 1465 1470

Leu Thr Cys Tyr Ile Lys Ala Arg Ala Ala Cys Arg Ala Ala Gly Leu
1475 1480 1485

Gln Asp Cys Thr Met Leu Val Cys Gly Asp Asp Leu Val Val Ile Cys
1490 1495 1500

Glu Ser Ala Gly Val Gln Glu Asp Ala Ala Ser Leu Arg Ala Phe Thr
1505 1510 1515 1520

Glu Ala Met Thr Arg Tyr Ser Ala Pro Pro Gly Asp Pro Pro Gln Pro
1525 1530 1535 

Glu Tyr Asp Leu Glu Leu Ile Thr Ser Cys Ser Ser Asn Val Ser Val
1540 1545 1550

Ala His Asp Gly Ala Gly Lys Arg Val Tyr Tyr Leu Thr Arg Asp Pro
1555 1560 1565

Thr Thr Pro Leu Ala Arg Ala Ala Trp Glu Thr Ala Arg His Thr Pro
1570 1575 1580

Val Asn Ser Trp Leu Gly Asn Ile Ile Met Phe Ala Pro Thr Leu Trp
1585 1590 1595 1600

Ala Arg Met Ile Leu Met Thr His Phe Phe Ser Val Leu Ile Ala Arg
1605 1610 1615

Asp Gln Leu Glu Gln Ala Leu Asp Cys Glu Ile Tyr Gly Ala Cys Tyr
1620 1625 1630

Ser Ile Glu Pro Leu Asp Leu Pro Pro Ile Ile Gln Arg Leu His Gly
1635 1640 1645

Leu Ser Ala Phe Ser Leu His Ser Tyr Ser Pro Gly Glu Ile Asn Arg
1650 1655 1660

Val Ala Ala Cys Leu Arg Lys Leu Gly Val Pro Pro Leu Arg Ala Trp
1665 1670 1675 1680

Arg His Arg Ala Arg Ser Val Arg Ala Arg Leu Leu Ala Arg Gly Gly
1685 1690 1695

Arg Ala Ala Ile Cys Gly Lys Tyr Leu Phe Asn Trp Ala Val Arg Thr
1700 1705 1710

Lys Leu Lys Leu Thr Pro Ile Ala Ala Ala Gly Gln Leu Asp Leu Ser
1715 1720 1725

Gly Trp Phe Thr Ala Gly Tyr Ser Gly Gly Asp Ile Tyr His Ser Val
1730 1735 1740

Ser His Ala Arg Pro Arg Trp Ile Trp Phe Cys Leu Leu Leu Leu Ala
1745 1750 1755 1760

Ala Gly Val Gly Ile Tyr Leu Leu Pro Asn Arg
1765 1770 



<210> 5 
<211> 4282 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: pCNVII 
<400> 5 

tcgcgcgttt cggtgatgac ggtgaaaacc tctgacacat gcagctcccg gagacggtca 60

cagcttgtct gtaagcggat gccgggagca gacaagcccg tcagggcgcg tcagcgggtg 120

ttggcgggtg tcggggctgg cttaactatg cggcatcaga gcagattgta ctgagagtgc 180

accatatgaa gctttttgca aaagcctagg cctccaaaaa agcctcctca ctacttctgg 240

aatagctcag aggccgaggc ggcctcggcc tctgcataaa taaaaaaaat tagtcagcca 300

tggggcggag aatgggcgga actgggcggg gagggaatta ttggctattg gccattgcat 360

acgttgtatc tatatcataa tatgtacatt tatattggct catgtccaat atgaccgcca 420

tgttgacatt gattattgac tagttattaa tagtaatcaa ttacggggtc attagttcat 480

agcccatata tggagttccg cgttacataa cttacggtaa atggcccgcc tggctgaccg 540

cccaacgacc cccgcccatt gacgtcaata atgacgtatg ttcccatagt aacgccaata 600

gggactttcc attgacgtca atgggtggag tatttacggt aaactgccca cttggcagta 660

catcaagtgt atcatatgcc aagtccgccc cctattgacg tcaatgacgg taaatggccc 720

gcctggcatt atgcccagta catgacctta cgggactttc ctacttggca gtacatctac 780

gtattagtca tcgctattac catggtgatg cggttttggc agtacaccaa tgggcgtgga 840

tagcggtttg actcacgggg atttccaagt ctccacccca ttgacgtcaa tgggagtttg 900

ttttggcacc aaaatcaacg ggactttcca aaatgtcgta ataaccccgc cccgttgacg 960

caaatgggcg gtaggcgtgt acggtgggag gtctatataa gcagagctcg tttagtgaac 1020

cgtcagatcg cctggagacg ccatccacgc tgttttgacc tccatagaag acaccgggac 1080

cgatccagcc tccgcggccg ggaacggtgc attggaacgc ggattccccg tgccaagagt 1140

gacgtaagta ccgcctatag actctatagg cacacccctt tggctcttat gcatgctata 1200

ctgtttttgg cttggggcct atacaccccc gcttccttat gctataggtg atggtatagc 1260

ttagcctata ggtgtgggtt attgaccatt attgaccact cccctattgg tgacgatact 1320

ttccattact aatccataac atggctcttt gccacaacta tctctattgg ctatatgcca 1380

atactctgtc cttcagagac tgacacggac tctgtatttt tacaggatgg ggtcccattt 1440

attatttaca aattcacata tacaacaacg ccgtcccccg tgcccgcagt ttttattaaa 1500

catagcgtgg gatctccacg cgaatctcgg gtacgtgttc cggacatggg ctcttctccg 1560

gtagcggcgg agcttccaca tccgagccct ggtcccatgc ctccagcggc tcatggtcgc 1620

tcggcagctc cttgctccta acagtggagg ccagacttag gcacagcaca atgcccacca 1680

ccaccagtgt gccgcacaag gccgtggcgg tagggtatgt gtctgaaaat gagctcggag 1740

attgggctcg caccgctgac gcagatggaa gacttaaggc agcggcagaa gaagatgcag 1800


gcagctgagt 


tgttgtattc 


tgataagagt 


cagaggtaac 


tcccgttgcg 


gtgctgttaa 


1860 
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cggtggaggg 


cagtgtagtc 


tgagcagtac 


tcgttgctgc 


cgcgcgcgcc 


accagacata 


1920 


atagctgaca 


gactaacaga 


ctgttccttt 


ccatgggtct 


tttctgcagt 


caccgtcgtc 


1980 


gacctaagaa 


ttcagactcg 


agcaagtcta 


gaaaggcgcg 


ccaagatatc 


aaggatccac 


2040 


tacgcgttag 


agctcgctga 


tcagcctcga 


ctgtgccttc 


tagttgccag 


ccatctgttg 


2100 


tttgcccctc 


ccccgtgcct 


tccttgaccc 


tggaaggtgc 


cactcccact 


gtcctttcct 


2160 


aataaaatga 


ggaaattgca 


tcgcattgtc 


tgagtaggtg 


tcattctatt 


ctggggggtg 


2220 


gggtggggca 


ggacagcaag 


ggggaggatt 


gggaagacaa 


tagcaggcat 


gctggggagc 


2280 


tcttccgctt 


cctcgctcac 


tgactcgctg 


cgctcggtcg 


ttcggctgcg 


gcgagcggta 


2340 


tcagctcact 


caaaggcggt 


aatacggtta 


tccacagaat 


caggggataa 


cgcaggaaag 


2400 


aacatgtgag 


caaaaggcca 


gcaaaaggcc 


aggaaccgta 


aaaaggccgc 


gttgctggcg 


2460 


tttttccata 


ggctccgccc 


ccctgacgag 


catcacaaaa 


atcgacgctc 


aagtcagagg 


2520 


tggcgaaacG 


cgacaggact 


ataaagatac 


caggcgtttc 


cccctggaag 


ctccctcgtg 


2580 


cgctctcctg 


ttccgaccct 


gccgcttacc 


ggatacctgt 


ccgcctttct 


cccttcggga 


2640 


agcgtggcgc 


tttctcaatg 


ctcacgctgt 


aggtatctca 


gttcggtgta 


ggtcgttcgc 


2700 


tccaagctgg 


gctgtgtgca 


cgaacccccc 


gttcagcccg 


accgctgcgc 


cttatccggt 


2760 


aactatcgtc 


ttgagtccaa 


cccggtaaga 


cacgacttat 


cgccactggc 


agcagccact 


2820 


ggtaacagga 


ttagcagagc 


gaggtatgta 


ggcggtgcta 


cagagttctt 


gaagtggtgg 


2880 


cctaactacg 


gctacactag 


aaggacagta 


tttggtatct 


gcgctctgct 


gaagccagtt 


2940 


accttcggaa 


aaagagttgg 


tagctcttga 


tccggcaaac 


aaaccaccgc 


tggtagcggt 


3000 


ggtttttttg 


tttgcaagca 


gcagattacg 


cgcagaaaaa 


aaggatctca 


agaagatcct 


3060 


ttgatctttt 


ctacggggtc 


tgacgctcag 


tggaacgaaa 


actcacgtta 


^999dttttg 


3120 


gtcatgagat 


tatcaaaaag 


gatcttcacc 


tagatccttt 


taaattaaaa 


atgaagtttt 


3180 


aaatcaatct 


aaagtatata 


tgagtaaact 


tggtctgaca 


gttaccaatg 


cttaatcagt 


3240 


gaggcaccta 


tctcagcgat 


ctgtctattt 


cgttcatcca 


tagttgcctg 


actccccgtc 


3300 


gtgtagataa 


ctacgatacg 


ggagggctta 


ccatctggcc 


ccagtgctgc 


aatgataccg 


3360 


cgagacccac 


gctcaccggc 


tccagattta 


tcagcaataa 


accagccagc 


cggaagggcc 


3420 


gagcgcagaa 


gtggtcctgc 


aactttatcc 


gcctccatcc 


agtctattaa 


ttgttgccgg 


3480 


gaagctagag 


taagtagttc 


gccagttaat 


agtttgcgca 


acgttgttgc 


cattgctaca 


3540 


ggcatcgtgg 


tgtcacgctc 


gtcgtttggt 


atggcttcat 


tcagctccgg 


ttcccaacga 


3600 


tcaaggcgag 


ttacatgatc 


ccccatgttg 


tgcaaaaaag 


cggttagctc 


cttcggtcct 


3660 
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ccgatcgttg 


tcagaagtaa gttggccgca gtgttatcac 


tcatggttat 


ggcagcactg 


3720 


cataattctc 


ttactgtcat 


gccatccgta 


agatgctttt 


ctgtgactgg 


tgagtactca 


3780 


accaagtcat 


tctgagaata 


gtgtatgcgg 


cgaccgagtt 


gctcttgccc 


ggcgtcaata 


3840 


c999^taata 


ccgcgccaca 


tagcagaact 


ttaaaagtgc 


tcatcattgg 


aaaacgttct 


3900 


tcggggcgaa 


aactctcaag gatcttaccg 


ctgttgagat 


ccagttcgat 


gtaacccact 


3960 


cgtgcaccca 


actgatcttc 


agcatctttt 


actttcacca 


gcgtttctgg gtgagcaaaa 


4020 


acaggaaggc 


aaaatgccgc 


aaaaaaggga 


ataagggcga cacggaaatg ttgaatactc 


4080 


atactcttcc 


tttttcaata 


ttattgaagc 


atttatcagg gttattgtct 


catgagcgga 


4140 


tacatatttg 


aatgtattta 


gaaaaataaa 


caaatagggg 


ttccgcgcac 


atttccccga 


4200 


aaagtgccac 


ctgacgtcta 


agaaaccatt 


attatcatga 


cattaaccta 


taaaaatagg 


4260 


cgtatcacga 


ggccctttcg 


tc 








4282 



<210> 6 
<211> 6299 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: pNS34a 

<220> 
<221> CDS 

<222> (1990)..(4047) 
<400> 6 



cgcgcgtttc ggtgatgacg gtgaaaacct ctgacacatg cagctcccgg agacggtcac 60
agcttgtctg taagcggatg ccgggagcag acaagcccgt cagggcgcgt cagcgggtgt 120
tggcgggtgt cggggctggc ttaactatgc ggcatcagag cagattgtac tgagagtgca 180
ccatatgaag ctttttgcaa aagcctaggc ctccaaaaaa gcctcctcac tacttctgga 240
atagctcaga ggccgaggcg gcctcggcct ctgcataaat aaaaaaaatt agtcagccat 300
ggggcggaga atgggcggaa ctgggcgggg agggaattat tggctattgg ccattgcata 360
cgttgtatct atatcataat atgtacattt atattggctc atgtccaata tgaccgccat 420
gttgacattg attattgact agttattaat agtaatcaat tacggggtca ttagttcata 480
gcccatatat ggagttccgc gttacataac ttacggtaaa tggcccgcct ggctgaccgc 540
ccaacgaccc ccgcccattg acgtcaataa tgacgtatgt tcccatagta acgccaatag 600
ggactttcca ttgacgtcaa tgggtggagt atttacggta aactgcccac ttggcagtac 660
atcaagtgta tcatatgcca agtccgcccc ctattgacgt caatgacggt aaatggcccg 720
cctggcatta tgcccagtac atgaccttac gggactttcc tacttggcag tacatctacg 780
tattagtcat cgctattacc atggtgatgc ggttttggca gtacaccaat gggcgtggat 840
agcggtttga ctcacgggga tttccaagtc tccaccccat tgacgtcaat gggagtttgt 900
tttggcacca aaatcaacgg gactttccaa aatgtcgtaa taaccccgcc ccgttgacgc 960
aaatgggcgg taggcgtgta cggtgggagg tctatataag cagagctcgt ttagtgaacc 1020
gtcagatcgc ctggagacgc catccacgct gttttgacct ccatagaaga caccgggacc 1080
gatccagcct ccgcggccgg gaacggtgca ttggaacgcg gattccccgt gccaagagtg 1140
acgtaagtac cgcctataga ctctataggc acaccccttt ggctcttatg catgctatac 1200
tgtttttggc ttggggccta tacacccccg ctccttatgc tataggtgat ggtatagctt 1260
agcctatagg tgtgggttat tgaccattat tgaccactcc cctattggtg acgatacttt 1320
ccattactaa tccataacat ggctctttgc cacaactatc tctattggct atatgccaat 1380
actctgtcct tcagagactg acacggactc tgtattttta caggatgggg tccatttatt 1440
atttacaaat tcacatatac aacaacgccg tcccccgtgc ccgcagtttt tattaaacat 1500
agcgtgggat ctccgacatc tcgggtacgt gttccggaca tgggctcttc tccggtagcg 1560
gcggagcttc cacatccgag ccctggtccc atccgtccag cggctcatgg tcgctcggca 1620
gctccttgct cctaacagtg gaggccagac ttaggcacag cacaatgccc accaccacca 1680
gtgtgccgca caaggccgtg gcggtagggt atgtgtctga aaatgagctc ggagattggg 1740
ctcgcacctg gacgcagatg gaagacttaa ggcagcggca gaagaagatg caggcagctg 1800
agttgttgta ttctgataag agtcagaggt aactcccgtt gcggtgctgt taacggtgga 1860
gggcagtgta gtctgagcag tactcgttgc tgccgcgcgc gccaccagac ataatagctg 1920
acagactaac agactgttcc tttccatggg tcttttctgc agtcaccgtc gtcgacctaa 1980


gaattcacc atg gcg ccc atc acg gcg tac gcc cag cag aca agg ggc ctc 2031 
Met Ala Pro Ile Thr Ala Tyr Ala Gln Gln Thr Arg Gly Leu 
1 5 10 

cta ggg tgc ata atc acc agc cta act ggc cgg gac aaa aac caa gtg 2079 
Leu Gly Cys Ile Ile Thr Ser Leu Thr Gly Arg Asp Lys Asn Gln Val 
15 20 25 30 

gag ggt gag gtc cag att gtg tca act gct gcc caa acc ttc ctg gca 2127 
Glu Gly Glu Val Gln Ile Val Ser Thr Ala Ala Gln Thr Phe Leu Ala 
35 40 45 

acg tgc atc aat ggg gtg tgc tgg act gtc tac cac ggg gcc gga acg 2175 
Thr Cys Ile Asn Gly Val Cys Trp Thr Val Tyr His Gly Ala Gly Thr 
50 55 60 

agg acc atc gcg tca ccc aag ggt cct gtc atc cag atg tat acc aat 2223 
Arg Thr Ile Ala Ser Pro Lys Gly Pro Val Ile Gln Met Tyr Thr Asn 
65 70 75 

gta gac caa gac ctt gtg ggc tgg ccc gct tcg caa ggt acc cgc tca 2271 
Val Asp Gln Asp Leu Val Gly Trp Pro Ala Ser Gln Gly Thr Arg Ser 
80 85 90 

ttg aca ccc tgc act tgc ggc tcc tcg gac ctt tac ctg gtc acg agg 2319 
Leu Thr Pro Cys Thr Cys Gly Ser Ser Asp Leu Tyr Leu Val Thr Arg 
95 100 105 110 

cac gcc gat gtc att ccc gtg cgc cgg cgg ggt gat agc agg ggc agc 2367 
His Ala Asp Val Ile Pro Val Arg Arg Arg Gly Asp Ser Arg Gly Ser 
115 120 125 

ctg ctg tcg ccc cgg ccc att tcc tac ttg aaa ggc tcc tcg ggg ggt 2415 
Leu Leu Ser Pro Arg Pro Ile Ser Tyr Leu Lys Gly Ser Ser Gly Gly 
130 135 140 

ccg ctg ttg tgc ccc gcg ggg cac gcc gtg ggc ata ttt agg gcc gcg 2463 
Pro Leu Leu Cys Pro Ala Gly His Ala Val Gly Ile Phe Arg Ala Ala 
145 150 155 

gtg tgc acc cgt gga gtg gct aag gcg gtg gac ttt atc cct gtg gag 2511 
Val Cys Thr Arg Gly Val Ala Lys Ala Val Asp Phe Ile Pro Val Glu 
160 165 170 

aac cta gag aca acc atg agg tcc ccg gtg ttc acg gat aac tcc tct 2559 
Asn Leu Glu Thr Thr Met Arg Ser Pro Val Phe Thr Asp Asn Ser Ser 
175 180 185 190 

cca cca gta gtg ccc cag agc ttc cag gtg gct cac ctc cat gct ccc 2607 
Pro Pro Val Val Pro Gln Ser Phe Gln Val Ala His Leu His Ala Pro 
195 200 205 

aca ggc agc ggc aaa agc acc aag gtc ccg gct gca tat gca gct cag 2655 
Thr Gly Ser Gly Lys Ser Thr Lys Val Pro Ala Ala Tyr Ala Ala Gln 
210 215 220 

ggc tat aag gtg cta gta ctc aac ccc tct gtt gct gca aca ctg ggc 2703 
Gly Tyr Lys Val Leu Val Leu Asn Pro Ser Val Ala Ala Thr Leu Gly 
225 230 235 

ttt ggt gct tac atg tcc aag gct cat ggg atc gat cct aac atc agg 2751 
Phe Gly Ala Tyr Met Ser Lys Ala His Gly Ile Asp Pro Asn Ile Arg 
240 245 250 

acc ggg gtg aga aca att acc act ggc agc ccc atc acg tac tcc acc 2799 
Thr Gly Val Arg Thr Ile Thr Thr Gly Ser Pro Ile Thr Tyr Ser Thr 
255 260 265 270 

tac ggc aag ttc ctt gcc gac ggc ggg tgc tcg ggg ggc gct tat gac 2847 
Tyr Gly Lys Phe Leu Ala Asp Gly Gly Cys Ser Gly Gly Ala Tyr Asp 
275 280 285 

ata ata att tgt gac gag tgc cac tcc acg gat gcc aca tcc atc ttg 2895 
Ile Ile Ile Cys Asp Glu Cys His Ser Thr Asp Ala Thr Ser Ile Leu 
290 295 300 

ggc att ggc act gtc ctt gac caa gca gag act gcg ggg gcg aga ctg 2943 
Gly Ile Gly Thr Val Leu Asp Gln Ala Glu Thr Ala Gly Ala Arg Leu 
305 310 315 

gtt gtg ctc gcc acc gcc acc cct ccg ggc tcc gtc act gtg ccc cat 2991 
Val Val Leu Ala Thr Ala Thr Pro Pro Gly Ser Val Thr Val Pro His 
320 325 330 

ccc aac atc gag gag gtt gct ctg tcc acc acc gga gag atc cct ttt 3039 
Pro Asn Ile Glu Glu Val Ala Leu Ser Thr Thr Gly Glu Ile Pro Phe 
335 340 345 350 

tac ggc aag gct atc ccc ctc gaa gta atc aag ggg ggg aga cat ctc 3087 
Tyr Gly Lys Ala Ile Pro Leu Glu Val Ile Lys Gly Gly Arg His Leu 
355 360 365 

atc ttc tgt cat tca aag aag aag tgc gac gaa ctc gcc gca aag ctg 3135 
Ile Phe Cys His Ser Lys Lys Lys Cys Asp Glu Leu Ala Ala Lys Leu 
370 375 380 

gtc gca ttg ggc atc aat gcc gtg gcc tac tac cgc ggt ctt gac gtg 3183 
Val Ala Leu Gly Ile Asn Ala Val Ala Tyr Tyr Arg Gly Leu Asp Val 
385 390 395 

tcc gtc atc ccg acc agc ggc gat gtt gtc gtc gtg gca acc gat gcc 3231 
Ser Val Ile Pro Thr Ser Gly Asp Val Val Val Val Ala Thr Asp Ala 
400 405 410 

ctc atg acc ggc tat acc ggc gac ttc gac tcg gtg ata gac tgc aat 3279 
Leu Met Thr Gly Tyr Thr Gly Asp Phe Asp Ser Val Ile Asp Cys Asn 
415 420 425 430 

acg tgt gtc acc cag aca gtc gat ttc agc ctt gac cct acc ttc acc 3327 
Thr Cys Val Thr Gln Thr Val Asp Phe Ser Leu Asp Pro Thr Phe Thr 
435 440 445 

att gag aca atc acg ctc ccc caa gat gct gtc tcc cgc act caa cgt 3375 
Ile Glu Thr Ile Thr Leu Pro Gln Asp Ala Val Ser Arg Thr Gln Arg 
450 455 460 

cgg ggc agg act ggc agg ggg aag cca ggc atc tac aga ttt gtg gca 3423 
Arg Gly Arg Thr Gly Arg Gly Lys Pro Gly Ile Tyr Arg Phe Val Ala 
465 470 475 

ccg ggg gag cgc ccc tcc ggc atg ttc gac tcg tcc gtc ctc tgt gag 3471 
Pro Gly Glu Arg Pro Ser Gly Met Phe Asp Ser Ser Val Leu Cys Glu 
480 485 490 

tgc tat gac gca ggc tgt gct tgg tat gag ctc acg ccc gcc gag act 3519 
Cys Tyr Asp Ala Gly Cys Ala Trp Tyr Glu Leu Thr Pro Ala Glu Thr 
495 500 505 510 

aca gtt agg cta cga gcg tac atg aac acc ccg ggg ctt ccc gtg tgc 3567 
Thr Val Arg Leu Arg Ala Tyr Met Asn Thr Pro Gly Leu Pro Val Cys 
515 520 525 

cag gac cat ctt gaa ttt tgg gag ggc gtc ttt aca ggc ctc act cat 3615 
Gln Asp His Leu Glu Phe Trp Glu Gly Val Phe Thr Gly Leu Thr His 
530 535 540 

ata gat gcc cac ttt cta tcc cag aca aag cag agt ggg gag aac ctt 3663 
Ile Asp Ala His Phe Leu Ser Gln Thr Lys Gln Ser Gly Glu Asn Leu 
545 550 555 

cct tac ctg gta gcg tac caa gcc acc gtg tgc gct agg gct caa gcc 3711 
Pro Tyr Leu Val Ala Tyr Gln Ala Thr Val Cys Ala Arg Ala Gln Ala 
560 565 570 

cct ccc cca tcg tgg gac cag atg tgg aag tgt ttg att cgc ctc aag 3759 
Pro Pro Pro Ser Trp Asp Gln Met Trp Lys Cys Leu Ile Arg Leu Lys 
575 580 585 590 

ccc acc ctc cat ggg cca aca ccc ctg cta tac aga ctg ggc gct gtt 3807 
Pro Thr Leu His Gly Pro Thr Pro Leu Leu Tyr Arg Leu Gly Ala Val 
595 600 605 

cag aat gaa atc acc ctg acg cac cca gtc acc aaa tac atc atg aca 3855 
Gln Asn Glu Ile Thr Leu Thr His Pro Val Thr Lys Tyr Ile Met Thr 
610 615 620 

tgc atg tcg gcc gac ctg gag gtc gtc acg agc acc tgg gtg ctc gtt 3903 
Cys Met Ser Ala Asp Leu Glu Val Val Thr Ser Thr Trp Val Leu Val 
625 630 635 

ggc ggc gtc ctg gct gct ttg gcc gcg tat tgc ctg tca aca ggc tgc 3951 
Gly Gly Val Leu Ala Ala Leu Ala Ala Tyr Cys Leu Ser Thr Gly Cys 
640 645 650 

gtg gtc ata gtg ggc agg gtc gtc ttg tcc ggg aag ccg gca atc ata 3999 
Val Val Ile Val Gly Arg Val Val Leu Ser Gly Lys Pro Ala Ile Ile 
655 660 665 670 

cct gac agg gaa gtc ctc tac cga gag ttc gat gag atg gaa gag tgc 4047 
Pro Asp Arg Glu Val Leu Tyr Arg Glu Phe Asp Glu Met Glu Glu Cys 
675 680 685 




taggatccac tacgcgttag agctcgctga tcagcctcga ctgtgccttc tagttgccag 4107
ccatctgttg tttgcccctc ccccgtgcct tccttgaccc tggaaggtgc cactcccact 4167
gtcctttcct aataaaatga ggaaattgca tcgcattgtc tgagtaggtg tcattctatt 4227
ctggggggtg gggtggggca ggacagcaag ggggaggatt gggaagacaa tagcaggcat 4287
gctggggagc tcttccgctt cctcgctcac tgactcgctg cgctcggtcg ttcggctgcg 4347
gcgagcggta tcagctcact caaaggcggt aatacggtta tccacagaat caggggataa 4407
cgcaggaaag aacatgtgag caaaaggcca gcaaaaggcc aggaaccgta aaaaggccgc 4467
gttgctggcg tttttccata ggctccgccc ccctgacgag catcacaaaa atcgacgctc 4527
aagtcagagg tggcgaaacc cgacaggact ataaagatac caggcgtttc cccctggaag 4587
ctccctcgtg cgctctcctg ttccgaccct gccgcttacc ggatacctgt ccgcctttct 4647
cccttcggga agcgtggcgc tttctcaatg ctcacgctgt aggtatctca gttcggtgta 4707
ggtcgttcgc tccaagctgg gctgtgtgca cgaacccccc gttcagcccg accgctgcgc 4767
cttatccggt aactatcgtc ttgagtccaa cccggtaaga cacgacttat cgccactggc 4827
agcagccact ggtaacagga ttagcagagc gaggtatgta ggcggtgcta cagagttctt 4887
gaagtggtgg cctaactacg gctacactag aaggacagta tttggtatct gcgctctgct 4947
gaagccagtt accttcggaa aaagagttgg tagctcttga tccggcaaac aaaccaccgc 5007
tggtagcggt ggtttttttg tttgcaagca gcagattacg cgcagaaaaa aaggatctca 5067
agaagatcct ttgatctttt ctacggggtc tgacgctcag tggaacgaaa actcacgtta 5127
agggattttg gtcatgagat tatcaaaaag gatcttcacc tagatccttt taaattaaaa 5187
atgaagtttt aaatcaatct aaagtatata tgagtaaact tggtctgaca gttaccaatg 5247
cttaatcagt gaggcaccta tctcagcgat ctgtctattt cgttcatcca tagttgcctg 5307
actccccgtc gtgtagataa ctacgatacg ggagggctta ccatctggcc ccagtgctgc 5367
aatgataccg cgagacccac gctcaccggc tccagattta tcagcaataa accagccagc 5427
cggaagggcc gagcgcagaa gtggtcctgc aactttatcc gcctccatcc agtctattaa 5487
ttgttgccgg gaagctagag taagtagttc gccagttaat agtttgcgca acgttgttgc 5547
cattgctaca ggcatcgtgg tgtcacgctc gtcgtttggt atggcttcat tcagctccgg 5607
ttcccaacga tcaaggcgag ttacatgatc ccccatgttg tgcaaaaaag cggttagctc 5667
cttcggtcct ccgatcgttg tcagaagtaa gttggccgca gtgttatcac tcatggttat 5727
ggcagcactg cataattctc ttactgtcat gccatccgta agatgctttt ctgtgactgg 5787
tgagtactca accaagtcat tctgagaata gtgtatgcgg cgaccgagtt gctcttgccc 5847
ggcgtcaata cgggataata ccgcgccaca tagcagaact ttaaaagtgc tcatcattgg 5907
aaaacgttct tcggggcgaa aactctcaag gatcttaccg ctgttgagat ccagttcgat 5967
gtaacccact cgtgcaccca actgatcttc agcatctttt actttcacca gcgtttctgg 6027
gtgagcaaaa acaggaaggc aaaatgccgc aaaaaaggga ataagggcga cacggaaatg 6087
ttgaatactc atactcttcc tttttcaata ttattgaagc atttatcagg gttattgtct 6147
catgagcgga tacatatttg aatgtattta gaaaaataaa caaatagggg ttccgcgcac 6207
atttccccga aaagtgccac ctgacgtcta agaaaccatt attatcatga cattaaccta 6267
taaaaatagg cgtatcacga ggccctttcg tc 6299



<210> 7 
<211> 686 
<212> PRT 

<213> Artificial Sequence 
<220> 
<223> Description of Artificial Sequence: pNS34a 
<400> 7 

Met Ala Pro Ile Thr Ala Tyr Ala Gln Gln Thr Arg Gly Leu Leu Gly 
1 5 10 15 

Cys Ile Ile Thr Ser Leu Thr Gly Arg Asp Lys Asn Gln Val Glu Gly 
20 25 30 

Glu Val Gln Ile Val Ser Thr Ala Ala Gln Thr Phe Leu Ala Thr Cys 
35 40 45 

Ile Asn Gly Val Cys Trp Thr Val Tyr His Gly Ala Gly Thr Arg Thr 
50 55 60 

Ile Ala Ser Pro Lys Gly Pro Val Ile Gln Met Tyr Thr Asn Val Asp 
65 70 75 80 

Gln Asp Leu Val Gly Trp Pro Ala Ser Gln Gly Thr Arg Ser Leu Thr 
85 90 95 

Pro Cys Thr Cys Gly Ser Ser Asp Leu Tyr Leu Val Thr Arg His Ala 
100 105 110 

Asp Val Ile Pro Val Arg Arg Arg Gly Asp Ser Arg Gly Ser Leu Leu 
115 120 125 

Ser Pro Arg Pro Ile Ser Tyr Leu Lys Gly Ser Ser Gly Gly Pro Leu 
130 135 140 

Leu Cys Pro Ala Gly His Ala Val Gly Ile Phe Arg Ala Ala Val Cys 
145 150 155 160 

Thr Arg Gly Val Ala Lys Ala Val Asp Phe Ile Pro Val Glu Asn Leu 
165 170 175 

Glu Thr Thr Met Arg Ser Pro Val Phe Thr Asp Asn Ser Ser Pro Pro 
180 185 190 

Val Val Pro Gln Ser Phe Gln Val Ala His Leu His Ala Pro Thr Gly 
195 200 205 

Ser Gly Lys Ser Thr Lys Val Pro Ala Ala Tyr Ala Ala Gln Gly Tyr 
210 215 220 

Lys Val Leu Val Leu Asn Pro Ser Val Ala Ala Thr Leu Gly Phe Gly 
225 230 235 240 

Ala Tyr Met Ser Lys Ala His Gly Ile Asp Pro Asn Ile Arg Thr Gly 
245 250 255 

Val Arg Thr Ile Thr Thr Gly Ser Pro Ile Thr Tyr Ser Thr Tyr Gly 
260 265 270 

Lys Phe Leu Ala Asp Gly Gly Cys Ser Gly Gly Ala Tyr Asp Ile Ile 
275 280 285 

Ile Cys Asp Glu Cys His Ser Thr Asp Ala Thr Ser Ile Leu Gly Ile 
290 295 300 

Gly Thr Val Leu Asp Gln Ala Glu Thr Ala Gly Ala Arg Leu Val Val 
305 310 315 320 

Leu Ala Thr Ala Thr Pro Pro Gly Ser Val Thr Val Pro His Pro Asn 
325 330 335 

Ile Glu Glu Val Ala Leu Ser Thr Thr Gly Glu Ile Pro Phe Tyr Gly 
340 345 350 

Lys Ala Ile Pro Leu Glu Val Ile Lys Gly Gly Arg His Leu Ile Phe 
355 360 365 

Cys His Ser Lys Lys Lys Cys Asp Glu Leu Ala Ala Lys Leu Val Ala 
370 375 380 

Leu Gly Ile Asn Ala Val Ala Tyr Tyr Arg Gly Leu Asp Val Ser Val 
385 390 395 400 

Ile Pro Thr Ser Gly Asp Val Val Val Val Ala Thr Asp Ala Leu Met 
405 410 415 

Thr Gly Tyr Thr Gly Asp Phe Asp Ser Val Ile Asp Cys Asn Thr Cys 
420 425 430 

Val Thr Gln Thr Val Asp Phe Ser Leu Asp Pro Thr Phe Thr Ile Glu 
435 440 445 

Thr Ile Thr Leu Pro Gln Asp Ala Val Ser Arg Thr Gln Arg Arg Gly 
450 455 460 

Arg Thr Gly Arg Gly Lys Pro Gly Ile Tyr Arg Phe Val Ala Pro Gly 
465 470 475 480 

Glu Arg Pro Ser Gly Met Phe Asp Ser Ser Val Leu Cys Glu Cys Tyr 
485 490 495 

Asp Ala Gly Cys Ala Trp Tyr Glu Leu Thr Pro Ala Glu Thr Thr Val 
500 505 510 

Arg Leu Arg Ala Tyr Met Asn Thr Pro Gly Leu Pro Val Cys Gln Asp 
515 520 525 

His Leu Glu Phe Trp Glu Gly Val Phe Thr Gly Leu Thr His Ile Asp 
530 535 540 

Ala His Phe Leu Ser Gln Thr Lys Gln Ser Gly Glu Asn Leu Pro Tyr 
545 550 555 560 

Leu Val Ala Tyr Gln Ala Thr Val Cys Ala Arg Ala Gln Ala Pro Pro 
565 570 575 

Pro Ser Trp Asp Gln Met Trp Lys Cys Leu Ile Arg Leu Lys Pro Thr 
580 585 590 

Leu His Gly Pro Thr Pro Leu Leu Tyr Arg Leu Gly Ala Val Gln Asn 
595 600 605 

Glu Ile Thr Leu Thr His Pro Val Thr Lys Tyr Ile Met Thr Cys Met 
610 615 620 

Ser Ala Asp Leu Glu Val Val Thr Ser Thr Trp Val Leu Val Gly Gly 
625 630 635 640 

Val Leu Ala Ala Leu Ala Ala Tyr Cys Leu Ser Thr Gly Cys Val Val 
645 650 655 

Ile Val Gly Arg Val Val Leu Ser Gly Lys Pro Ala Ile Ile Pro Asp 
660 665 670 

Arg Glu Val Leu Tyr Arg Glu Phe Asp Glu Met Glu Glu Cys 
675 680 685 
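
The CDS feature of SEQ ID NO: 6 spans positions 1990 to 4047, and SEQ ID NO: 7 is 686 residues long. The following is a minimal sketch, not part of the original listing, of how those two annotations can be cross-checked and how the opening codons can be translated. It assumes the standard genetic code; the helper names `translate` and `cds_codon_count` are hypothetical and introduced only for illustration.

```python
from itertools import product

# Standard genetic code (assumed), bases ordered t, c, a, g at each position.
BASES = "tcag"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna: str) -> str:
    """Translate a DNA string (spaces ignored) into one-letter amino acids."""
    dna = dna.replace(" ", "").lower()
    usable = len(dna) - len(dna) % 3  # drop any trailing partial codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def cds_codon_count(start: int, end: int) -> int:
    """Number of codons in a CDS given 1-based inclusive coordinates."""
    length = end - start + 1
    if length % 3 != 0:
        raise ValueError("CDS length is not a multiple of three")
    return length // 3


# First 14 codons of the SEQ ID NO: 6 coding region (positions 1990 to 2031).
first_codons = "atg gcg ccc atc acg gcg tac gcc cag cag aca agg ggc ctc"
print(translate(first_codons))
print(cds_codon_count(1990, 4047))
```

Under these assumptions the codon count of the annotated span equals the `<211>` length given for SEQ ID NO: 7, and the translated opening codons correspond to the residues Met Ala Pro Ile Thr Ala Tyr Ala Gln Gln Thr Arg Gly Leu shown in the listing.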

<210> 8 
<211> 19912 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: pd.deltaNS3NS5 

<220> 

<221> CDS 

<222> (12745)..(18057) 

<400> 8 



atcgatccta ccccttgcgc taaagaagta tatgtgccta ctaacgcttg tctttgtctc 60
tgtcactaaa cactggatta ttactcccag atacttattt tggactaatt taaatgattt 120
cggatcaacg ttcttaatat cgctgaatct tccacaattg atgaaagtag ctaggaagag 180
gaattggtat aaagtttttg tttttgtaaa tctcgaagta tactcaaacg aatttagtat 240
tttctcagtg atctcccaga tgctttcacc ctcacttaga agtgctttaa gcattttttt 300
actgtggcta tttcccttat ctgcttcttc cgatgattcg aactgtaatt gcaaactact 360
tacaatatca gtgatatcag attgatgttt ttgtccatag taaggaataa ttgtaaattc 420
ccaagcagga atcaatttct ttaatgaggc ttccagaatt gttgcttttt gcgtcttgta 480
tttaaactgg agtgatttat tgacaatatc gaaactcagc gaattgctta tgatagtatt 540
atagctcatg aatgtggctc tcttgattgc tgttccgtta tgtgtaatca tccaacataa 600
ataggttagt tcagcagcac ataatgctat tttctcacct gaaggtcttt caaacctttc 660
cacaaactga cgaacaagca ccttaggtgg tgttttacat aatatatcaa attgtggcat 720
gcttagcgcc gatcttgtgt gcaattgata tctagtttca actactctat ttatcttgta 780
tcttgcagta ttcaaacacg ctaactcgaa aaactaactt taattgtcct gtttgtctcg 840
cgttctttcg aaaaatgcac cggccgcgca ttatttgtac tgcgaaaata attggtactg 900
cggtatcttc atttcatatt ttaaaaatgc acctttgctg cttttcctta atttttagac 960
ggcccgcagg ttcgttttgc ggtactatct tgtgataaaa agttgttttg acatgtgatc 1020
tgcacagatt ttataatgta ataagcaaga atacattatc aaacgaacaa tactggtaaa 1080
agaaaaccaa aatggacgac attgaaacag ccaagaatct gacggtaaaa gcacgtacag 1140
cttatagcgt ctgggatgta tgtcggctgt ttattgaaat gattgctcct gatgtagata 1200
ttgatataga gagtaaacgt aagtctgatg agctactctt tccaggatat gtcataaggc 1260
ccatggaatc tctcacaacc ggtaggccgt atggtcttga ttctagcgca gaagattcca 1320
gcgtatcttc tgactccagt gctgaggtaa ttttgcctgc tgcgaagatg gttaaggaaa 1380
ggtttgattc gattggaaat ggtatgctct cttcacaaga agcaagtcag gctgccatag 1440
atttgatgct acagaataac aagctgttag acaatagaaa gcaactatac aaatctattg 1500
ctataataat aggaagattg cccgagaaag acaagaagag agctaccgaa atgctcatga 1560
gaaaaatgga ttgtacacag ttattagtcc caccagctcc aacggaagaa gatgttatga 1620
agctcgtaag cgtcgttacc caattgctta ctttagttcc accagatcgt caagctgctt 1680
taataggtga tttattcatc ccggaatctc taaaggatat attcaatagt ttcaatgaac 1740
tggcggcaga gaatcgttta cagcaaaaaa agagtgagtt ggaaggaagg actgaagtga 1800
accatgctaa tacaaatgaa gaagttccct ccaggcgaac aagaagtaga gacacaaatg 1860
caagaggagc atataaatta caaaacacca tcactgaggg ccctaaagcg gttcccacga 1920
aaaaaaggag agtagcaacg agggtaaggg gcagaaaatc acgtaatact tctagggtat 1980
gatccaatat caaaggaaat gatagcattg aaggatgaga ctaatccaat tgaggagtgg 2040
cagcatatag aacagctaaa gggtagtgct gaaggaagca tacgataccc cgcatggaat 2100
gggataatat cacaggaggt actagactac ctttcatcct acataaatag acgcatataa 2160
gtacgcattt aagcataaac acgcactatg ccgttcttct catgtatata tatatacagg 2220
caacacgcag atataggtgc gacgtgaaca gtgagctgta tgtgcgcagc tcgcgttgca 2280
ttttcggaag cgctcgtttt cggaaacgct ttgaagttcc tattccgaag ttcctattct 2340
ctagaaagta taggaacttc agagcgcttt tgaaaaccaa aagcgctctg aagacgcact 2400
ttcaaaaaac caaaaacgca ccggactgta acgagctact aaaatattgc gaataccgct 2460
tccacaaaca ttgctcaaaa gtatctcttt gctatatatc tctgtgctat atccctatat 2520
aacctaccca tccacctttc gctccttgaa cttgcatcta aactcgacct ctacatcaac 2580
aggcttccaa tgctcttcaa attttactgt caagtagacc catacggctg taatatgctg 2640
ctcttcataa tgtaagctta tctttatcga atcgtgtgaa aaactactac cgcgataaac 2700
ctttacggtt ccctgagatt gaattagttc ctttagtata tgatacaaga cacttttgaa 2760
ctttgtacga cgaattttga ggttcgccat cctctggcta tttccaatta tcctgtcggc 2820

2820 
gaaacatgct gcttaaaact ccaagcggta ggagaccgat aaaggttaat aggacagccg 2880
tattatctcc gcctcagttt gatcttccgc ttcagactgc catttttcac ataatgaatc 2940
tatttcaccc cacaatcctt catccgcctc cgcatcttgt tccgttaaac tattgacttc 3000
atgttgtaca ttgtttagtt cacgagaagg gtcctcttca ggcggtagct cctgatctcc 3060
tatatgacct ttatcctgtt ctctttccac aaacttagaa atgtattcat gaattatgga 3120
gcacctaata acattcttca d39cggagaa gtttgggcca gatgcccaat atgcttgaca 3180
tgaaaacgtg agaatgaatt tagtattatt gtgatattct gaggcaattt tattataatc 3240
tcgaagataa gagaagaatg cagtgacctt tgtattgaca aatggagatt ccatgtatct 3300
aaaaaatacg cctttaggcc ttctgatacc ctttcccctg cggtttagcg tgccttttac 3360
attaatatct aaaccctctc cgatggtggc ctttaactga ctaataaatg caaccgatat 3420
aaactgtgat aattctgggt gatttatgat tcgatcgaca attgtattgt acactagtgc 3480
aggatcaggc caatccagtt ctttttcaat taccggtgtg tcgtctgtat tcagtacatg 3540
tccaacaaat gcaaatgcta acgttttgta tttcttataa ttgtcaggaa ctggaaaagt 3600
cccccttgtc gtctcgatta cacacctact ttcatcgtac accataggtt ggaagtgctg 3660
cataatacat tgcttaatac aagcaagcag tctctcgcca ttcatatttc agttattttc 3720
cattacagct gatgtcattg tatatcagcg ctgtaaaaat ctatctgtta cagaaggttt 3780
tcgcggtttt tataaacaaa actttcgtta cgaaatcgag caatcacccc agctgcgtat 3840
ttggaaattc gggaaaaagt agagcaacgc gagttgcatt ttttacacca taatgcatga 3900
ttaacttcga gaagggatta aggctaattt cactagtatg tttcaaaaac ctcaatctgt 3960
ccattgaatg ccttataaaa cagctataga ttgcatagaa gagttagcta ctcaatgctt 4020
tttgtcaaag cttactgatg atgatgtgtc tactttcagg cgggtctgta gtaaggagaa 4080
tgacattata aagctggcac ttagaattcc acggactata gactatacta gtatactccg 4140
tctactgtac gatacacttc cgctcaggtc cttgtccttt aacgaggcct taccactctt 4200
ttgttactct attgatccag ctcagcaaag gcagtgtgat ctaagattct atcttcgcga 4260
tgtagtaaaa ctagctagac cgagaaagag actagaaatg caaaaggcac ttctacaatg 4320
gctgccatca ttattatccg atgtgacgct gcattttttt tttttttttt tttttttttt 4380
tttttttttt tttttttttt ttttttggta caaatatcat aaaaaaagag aatcttttta 4440
agcaaggatt ttcttaactt cttcggcgac agcatcaccg acttcggtgg tactgttgga 4500
accacctaaa tcaccagttc tgatacctgc atccaaaacc tttttaactg catcttcaat 4560
ggctttacct tcttcaggca agttcaatga caatttcaac atcattgcag cagacaagat 4620
agtggcgata gggttgacct 


tattctttgg 


gtacaaacca aatgcggtgt 


tcttgtctgg 


acccaaggag 


cctgggataa 


cggaggcttc 


ggtgattata ataccattta 


ggtgggttgg 


a 9 1^ 9 a ^ A 

aatcaaucga 


ugucgaacuu 


tcaatgtagg 




aauCu cgaag 


aggccaaaac 


tggcggctca 


cgcuguaggg 


ccatgaaagc 




f' ^ ^ 9 ^ ^ a ^ 

uguccacuau 


cccaagcgac 


*a a a/^^ a ^ ^ ^ 

oaagCaaoUcl 




attctctaac 


tggcttgatt 


ggaga^aagt 


ctaaaagaga 


ggcgcacaac 


cgaagc c c c c 


tacggatttt 


^ A ^ ^ ^ a i~ 

gguaCCCCau 


^ ^ 9 9 ^ ^ 9 ^ 


ccacagcacc 


ttccagcgcc 


tcatctggaa 


gtggaacacc 


atgattttcg 


aaatcgaact 


tgacattgga 


aatggcttcg 


gctgtgattt 


cttgaccaac 


aggggcagac 


attacaatgg 


tatatccttg 


aaaaaaaaaa 


atgcagcttc 


tcaatgatat 


tatccgacaa 


actgttttac 


agatttacga 


acatccgaac 


ctgggagttt 


tccctgaaac 


tatagtctag cgctttacgg 


aagacaatgt 


atctattgca 


taggtaatct 


tgcacgtcgc 


tgcacttcaa 


tagcatatct 


ttgttaacga 


atgcaacgcg 


agagcgctaa 


tttttcaaac 


gaaatgcaac 


gcgaaagcgc 


tattttacca 


caaaaatgca 


acgcgagagc 


gctaattttt 


gaacagaaat 


gcaacgcgag 


agcgctattt 


ttctacaaaa 


atgcatcccg 


agagcgctat 


tttctccttt 


gtgcgctcta 


taatgcagtc 


taaggttaga 


agaaggctac 


tttggtgtct 


cacttcccgc 


gtttactgat 


tactagcgaa 



caaatctgga gcggaaccat ggcatggttc 4680 
caaagaggcc aaggacgcag atggcaacaa 4740 
atcggagatg atatcaccaa acatgttgct 4800 
gttcttaact aggatcatgg cggcagaatc 4860 
gaattcgttc ttgatggttt cctccacagt 4920 
attagcttta tccaaggacc aaataggcaa 4980 
ggccattctt gtgattcttt gcacttctgg 5040 
accatcacca tcgtcttcct ttctcttacc 5100 
aacaacgaag tcagtacctt tagcaaattg 5160 
gtcggatgca aagttacatg gtcttaagtt 5220 
tagtaaacct tgttcaggtc taacactacc 5280 
taacaaaacg gcatcagcct tcttggaggc 5340 
tgtagcatcg atagcagcac caccaattaa 5400 
acgaacatca gaaatagctt taagaacctt 5460 
gtggtcacct ggcaaaacga cgatcttctt 5520 
aaatatatat aaaaaaaaaa aaaaaaaaaa 5580 
tcgaatacgc tttgaggaga tacagcctaa 5640 
tcgtacttgt tacccatcat tgaattttga 5700 
agatagtata tttgaacctg tataataata 5760 
atgtatttcg gttcctggag aaactattgc 5820 
atccccggtt cattttctgc gtttccatct 5880 
agcatctgtg cttcattttg tagaacaaaa 5940 
aaagaatctg agctgcattt ttacagaaca 6000 
acgaagaatc tgtgcttcat ttttgtaaaa 6060 
caaacaaaga atctgagctg catttttaca 6120 
taccaacaaa gaatctatac ttcttttttg 6180 
ttttctaaca aagcatctta gattactttt 6240 
tcttgataac tttttgcact gtaggtccgt 6300 
attttctctt ccataaaaaa agcctgactc 6360 
gctgcgggtg cattttttca agataaaggc 6420 
atccccgatt 


atattctata 


ccgatgtgga 


gcgttgatga 


ttcttcattg 


gtcagaaaat 


atactacgta 


taggaaatgt 


ttacattttc 


tcttactaca 


atttttttgt 


ctaaagagta 


gtcgagttta 


gatgcaagtt 


caaggagcga 


agcacagaga 


tatatagcaa 


agagatactt 


aatattttag 


tagctcgtta 


cagtccggtg 


gagcgctttt 


ggttttcaaa 


agcgctctga 


tcggaatagg 


aacttcaaag 


cgtttccgaa 


tgcgcacata 


cagctcactg 


ttcacgtcgc 


tatacatgag 


aagaacggca 


tagtgcgtgt 


atttatgtag 


gatgaaaggt 


agtctagtac 


gtatcgtatg 


cttccttcag 


cactaccctt 


tggattagtc 


tcatccttca 


atgctatcat 


ccgagaaact 


agtgcgaagt 


agtgatcagg 


cctggccacg 


gcagaagcac 


gcttatcgct 


taggcccttc 


attgaaagaa 


atgaggtcat 


attttttata 


gcaaagattg 


aataaggcgc 


gactaagtta 


tcttttaata 


attggtattc 


atttactcgt 


tttaggactg 


gttcagaatt 


atcgatgata 


agctgtcaaa 


catgagaatt 


tatttttata 


ggttaatgtc 


atgataataa 


9999aa^tgt 


gcgcggaacc 


cctatttgtt 


cgctcatgag 


acaataaccc 


tgataaatgc 


gtattcaaca 


tttccgtgtc 


gcccttattc 


ttgctcaccc 


agaaacgctg 


gtgaaagtaa 


tgggttacat 


cgaactggat 


ctcaacagcg 


aacgttttcc 


aatgatgagc 


acttttaaag 


ttgacgccgg 


gcaagagcaa 


ctcggtcgcc 


agtactcacc 


agtcacagaa 


aagcatctta 



ttgcgcatac tttgtgaaca gaaagtgata 6480 
tatgaacggt ttcttctatt ttgtctctat 6540 
gtattgtttt cgattcactc tatgaatagt 6600 
atactagaga taaacataaa aaatgtagag 6660 
aaggtggatg ggtaggttat atagggatat 6720 
ttgagcaatg tttgtggaag cggtattcgc 6780 
cgtttttggt tttttgaaag tgcgtcttca 6840 
agttcctata ctttctagag aataggaact 6900 
aacgagcgct tccgaaaatg caacgcgagc 6960 
acctatatct gcgtgttgcc tgtatatata 7020 
ttatgcttaa atgcgtactt atatgcgtct 7080 
ctcctgtgat attatcccat tccatgcggg 7140 
tagctgttct atatgctgcc actcctcaat 7200 
ttcctttgat attggatcat atgcatagta 7260 
tattgctgtt atctgatgag tatacgttgt 7320 
ccaatttccc acaacattag tcaactccgt 7380 
caaatgtctt ccaatgtgag attttgggcc 7440 
atttttcttc aaagctttat tgtacgatct 7500 
ctgtttattg cttgaagaat tgccggtcct 7560 
cctcaaaaat tcatccaaat atacaagtgg 7620 
cttgaagacg aaagggcctc gtgatacgcc 7680 
tggtttctta gacgtcaggt ggcacttttc 7740 
tatttttcta aatacattca aatatgtatc 7800 
ttcaataata ttgaaaaagg aagagtatga 7860 
ccttttttgc ggcattttgc cttcctgttt 7920 
aagatgctga agatcagttg ggtgcacgag 7980 
gtaagatcct tgagagtttt cgccccgaag 8040 
ttctgctatg tggcgcggta ttatcccgtg 8100 
gcatacacta ttctcagaat gacttggttg 8160 
cggatggcat gacagtaaga gaattatgca 8220 
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gtgctgccat 


aaccatgagt gataacactg 


cggccaactt 


acttctgaca 


acgatcggag 


8280 


gaccgaagga 


gctaaccgct 


tttttgcaca 


acatggggga 


tcatgtaact 


cgccttgatc 


8340 


gttgggaacc 


ggagctgaat 


gaagccatac 


caaacgacga 


gcgtgacacc 


acgatgcctg 


8400 


cagcaatggc 


aacaacgttg 


cgcaaactat 


taactggcga 


actacttact 


ctagcttccc 


8460 


ggcaacaatt 


aatagactgg atggaggcgg 


ataaagttgc 


aggaccactt 


ctgcgctcgg 


8520 


cccttccggc 


tggctggttt attgctgata 


aatctggagc 


cggtgagcgt 


gggtctcgcg 


8580 


gtatcattgc 


agcactgggg ccagatggta 


agccctcccg 


tatcgtagtt 


atctacacga 


8640 


cggggagtca 


ggcaactatg gatgaacgaa 


atagacagat 


cgctgagata 


ggtgcctcac 


8700 


tgattaagca 


ttggtaactg 


tcagaccaag 


tttactcata 


tatactttag 


attgatttaa 


8760 


aacttcattt 


ttaatttaaa 


aggatctagg 


tgaagatcct 


ttttgataat 


ctcatgacca 


8820 


aaatccctta 


acgtgagttt 


tcgttccact 


gagcgtcaga 


ccccgtagaa 


aagatcaaag 


8880 


gatcttcttg 


agatcctttt 


tttctgcgcg 


taatctgctg 


cttgcaaaca 


aaaaaaccac 


8940 


cgctaccagc 


ggtggtttgt 


ttgccggatc 


aagagctacc 


aactcttttt 


ccgaaggtaa 


9000 


ctggcttcag 


cagagcgcag 


ataccaaata 


ctgtccttct 


agtgtagccg 


tagttaggcc 


9060 


accacttcaa 


gaactctgta 


gcaccgccta 


catacctcgc 


tctgctaatc 


ctgttaccag 


9120 


tggctgctgc 


cagtggcgat 


aagtcgtgtc 


ttaccgggtt 


ggactcaaga 


cgatagttac 


9180 


cggataaggc 


gcagcggtcg ggctgaacgg 


ggggttcgtg 


cacacagccc 


agcttggagc 


9240 


gaacgaccta 


caccgaactg 


agatacctac 


agcgtgagct 


atgagaaagc gccacgcttc 


9300 


ccgaagggag 


aaaggcggac 


aggtatccgg 


taagcggcag 


ggtcggaaca ggagagcgca 


9360 


cgagggagct 


tccaggggga 


aacgcctggt 


atctttatag 


tcctgtcggg tttcgccacc 


9420 


tctgacttga 


gcgtcgattt 


ttgtgatgct 


cgtcaggggg 


gcggagccta 


tggaaaaacg 


9480 


ccagcaacgc 


ggccttttta 


cggttcctgg 


ccttttgctg 


gccttttgct 


cacatgttct 


9540 


ttcctgcgtt 


atcccctgat 


tctgtggata 


accgtattac 


cgcctttgag 


tgagctgata 


9600 


ccgctcgccg 


cagccgaacg 


accgagcgca 


gcgagtcagt 


gagcgaggaa 


gcggaagagc 


9660 


gcctgatgcg 


gtattttctc 


cttacgcatc 


tgtgcggtat 


ttcacaccgc 


atatggtgca 


9720 


ctctcagtac 


aatctgctct gatgccgcat 


agttaagcca 


gtatacactc 


cgctatcgct 


9780 


acgtgactgg 


gtcatggctg 


cgccccgaca 


cccgccaaca 


cccgctgacg 


cgccctgacg 


9840 


ggcttgtctg 


ctcccggcat 


ccgcttacag 


acaagctgtg 


accgtctccg 


ggagctgcat 


9900 


gtgtcagagg 


ttttcaccgt 


catcaccgaa 


acgcgcgagg 


cagctgcggt 


aaagctcatc 


9960 


agcgtggtcg 


tgaagcgatt 


cacagatgtc 


tgcctgttca 


tccgcgtcca 


gctcgttgag 


10020 
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tttctccaga 


agcgttaatg 


tctggcttct gataaagcgg gccatgttaa gggcggtttt 


10080 


ttcctgtttg 


gtcactgatg 


cctccgtgta agggggattt 


ctgttcatgg gggtaatgat 


10140 


accgatgaaa cgagagagga tgctcacgat acgggttact gatgatgaac [illegible] 10200 
actggaacgt tgtgagggta aacaactggc ggtatggatg cggcgggacc [illegible] 10260 
cactcagggt caatgccagc gcttcgttaa tacagatgta ggtgttccac [illegible] 10320 
gcagcatcct gcgatgcaga tccggaacat aatggtgcag ggcgctgact [illegible] 10380 
cagactttac gaaacacgga aaccgaagac cattcatgtt gttgctcagg [illegible] 10440 
tttgcagcag cagtcgcttc acgttcgctc gcgtatcggt gattcattct gctaaccagt 10500 
aaggcaaccc cgccagccta gccgggtcct caacgacagg agcacgatca tgcgcacccg 10560 
tggccaggac ccaacgctgc ccgagatgcg ccgcgtgcgg ctgctggaga tggcggacgc 10620 
gatggatatg ttctgccaag ggttggtttg cgcattcaca gttctccgca agaattgatt 10680 
ggctccaatt cttggagtgg tgaatccgtt agcgaggtgc cgccggcttc cattcaggtc 10740 
gaggtggccc ggctccatgc accgcgacgc aacgcgggga ggcagacaag gtatagggcg 10800 
gcgcctacaa tccatgccaa cccgttccat gtgctcgccg aggcggcata aatcgccgtg 10860 
acgatcagcg gtccaatgat cgaagttagg ctggtaagag ccgcgagcga tccttgaagc 10920 
tgtccctgat ggtcgtcatc tacctgcctg gacagcatgg cctgcaacgc gggcatcccg 10980 
atgccgccgg aagcgagaag aatcataatg gggaaggcca tccagcctcg cgtcgcgaac 11040 
gccagcaaga cgtagcccag cgcgtcggcc gccatgccgg cgataatggc ctgcttctcg 11100 
ccgaaacgtt tggtggcggg accagtgacg aaggcttgag cgagggcgtg caagattccg 11160 
aataccgcaa gcgacaggcc gatcatcgtc gcgctccagc gaaagcggtc ctcgccgaaa 11220 
atgacccaga gcgctgccgg cacctgtcct acgagttgca tgataaagaa gacagtcata 11280 
agtgcggcga cgatagtcat gccccgcgcc caccggaagg agctgactgg gttgaaggct 11340 
ctcaagggca tcggtcgagg atccttcaat atgcgcacat acgctgttat gttcaaggtc 11400 
ccttcgttta agaacgaaag cggtcttcct tttgagggat gtttcaagtt gttcaaatct 11460 
atcaaatttg caaatcccca gtctgtatct agagcgttga atcggtgatg cgatttgtta 11520 
attaaattga tggtgtcacc attaccaggt ctagatatac caatggcaaa ctgagcacaa 11580 
caataccagt ccggatcaac tggcaccatc tctcccgtag tctcatctaa tttttcttcc 11640 
ggatgaggtt ccagatatac cgcaacacct ttattatggt ttccctgagg gaataataga 11700 
atgtcccatt cgaaatcacc aattctaaac ctgggcgaat tgtatttcgg gtttgttaac 11760 
tcgttccagt caggaatgtt ccacgtgaag ctatcttcca gcaaagtctc cacttcttca 11820 
tcaaattgtg gagaatactc ccaatgctct tatctatggg acttccggga aacacagtac 11880 
cgatacttcc caattcgtct tcagagctca ttgtttgttt gaagagacta atcaaagaat 11940 
cgttttctca aaaaaattaa tatcttaact gatagtttga tcaaaggggc aaaacgtagg 12000 
ggcaaacaaa cggaaaaatc gtttctcaaa ttttctgatg ccaagaactc taaccagtct 12060 
tatctaaaaa ttgccttatg atccgtctct ccggttacag cctgtgtaac tgattaatcc 12120 
tgcctttcta atcaccattc taatgtttta attaagggat tttgtcttca ttaacggctt 12180 
tcgctcataa aaatgttatg acgttttgcc cgcaggcggg aaaccatcca cttcacgaga 12240 
ctgatctcct ctgccggaac accgggcatc tccaacttat aagttggaga aataagagaa 12300 
tttcagattg agagaatgaa aaaaaaaaac ccttagttca taggtccatt ctcttagcgc 12360 
aactacagag aacaggggca caaacaggca aaaaacgggc acaacctcaa tggagtgatg 12420 
caacctgcct ggagtaaatg atgacacaag gcaattgacc cacgcatgta tctatctcat 12480 
tttcttacac cttctattac cttctgctct ctctgatttg gaaaaagctg aaaaaaaagg 12540 
ttgaaaccag ttccctgaaa ttattcccct acttgactaa taagtatata aagacggtag 12600 
gtattgattg taattctgta aatctatttc ttaaacttct taaattctac ttttatagtt 12660 
agtctttttt ttagttttaa aacaccaaga acttagtttc gaataaacac acataaacaa 12720 
acaagcttac aaaacaaatt cacc atg gct gca tat gca gct cag ggc tat 12771 
Met Ala Ala Tyr Ala Ala Gln Gly Tyr 
1 5 

aag gtg cta gta ctc aac ccc tct gtt gct gca aca ctg ggc ttt ggt 12819 
Lys Val Leu Val Leu Asn Pro Ser Val Ala Ala Thr Leu Gly Phe Gly 
10 15 20 25 

gct tac atg tcc aag gct cat ggg atc gat cct aac atc agg acc ggg 12867 
Ala Tyr Met Ser Lys Ala His Gly Ile Asp Pro Asn Ile Arg Thr Gly 
30 35 40 

gtg aga aca att acc act ggc agc ccc atc acg tac tcc acc tac ggc 12915 
Val Arg Thr Ile Thr Thr Gly Ser Pro Ile Thr Tyr Ser Thr Tyr Gly 
45 50 55 

aag ttc ctt gcc gac ggc ggg tgc tcg ggg ggc gct tat gac ata ata 12963 
Lys Phe Leu Ala Asp Gly Gly Cys Ser Gly Gly Ala Tyr Asp Ile Ile 
60 65 70 

att tgt gac gag tgc cac tcc acg gat gcc aca tcc atc ttg ggc att 13011 
Ile Cys Asp Glu Cys His Ser Thr Asp Ala Thr Ser Ile Leu Gly Ile 
75 80 85 

ggc act gtc ctt gac caa gca gag act gcg ggg gcg aga ctg gtt gtg 13059 
Gly Thr Val Leu Asp Gln Ala Glu Thr Ala Gly Ala Arg Leu Val Val 
90 95 100 105 

ctc gcc acc gcc acc cct ccg ggc tcc gtc act gtg ccc cat ccc aac 13107 
Leu Ala Thr Ala Thr Pro Pro Gly Ser Val Thr Val Pro His Pro Asn 
110 115 120 

atc gag gag gtt gct ctg tcc acc acc gga gag atc cct ttt tac ggc 13155 
Ile Glu Glu Val Ala Leu Ser Thr Thr Gly Glu Ile Pro Phe Tyr Gly 
125 130 135 

aag gct atc ccc ctc gaa gta atc aag ggg ggg aga cat ctc atc ttc 13203 
Lys Ala Ile Pro Leu Glu Val Ile Lys Gly Gly Arg His Leu Ile Phe 
140 145 150 

tgt cat tca aag aag aag tgc gac gaa ctc gcc gca aag ctg gtc gca 13251 
Cys His Ser Lys Lys Lys Cys Asp Glu Leu Ala Ala Lys Leu Val Ala 
155 160 165 

ttg ggc atc aat gcc gtg gcc tac tac cgc ggt ctt gac gtg tcc gtc 13299 
Leu Gly Ile Asn Ala Val Ala Tyr Tyr Arg Gly Leu Asp Val Ser Val 
170 175 180 185 

atc ccg acc agc ggc gat gtt gtc gtc gtg gca acc gat gcc ctc atg 13347 
Ile Pro Thr Ser Gly Asp Val Val Val Val Ala Thr Asp Ala Leu Met 
190 195 200 

acc ggc tat acc ggc gac ttc gac tcg gtg ata gac tgc aat acg tgt 13395 
Thr Gly Tyr Thr Gly Asp Phe Asp Ser Val Ile Asp Cys Asn Thr Cys 
205 210 215 

gtc acc cag aca gtc gat ttc agc ctt gac cct acc ttc acc att gag 13443 
Val Thr Gln Thr Val Asp Phe Ser Leu Asp Pro Thr Phe Thr Ile Glu 
220 225 230 

aca atc acg ctc ccc caa gat gct gtc tcc cgc act caa cgt cgg ggc 13491 
Thr Ile Thr Leu Pro Gln Asp Ala Val Ser Arg Thr Gln Arg Arg Gly 
235 240 245 

agg act ggc agg ggg aag cca ggc atc tac aga ttt gtg gca ccg ggg 13539 
Arg Thr Gly Arg Gly Lys Pro Gly Ile Tyr Arg Phe Val Ala Pro Gly 
250 255 260 265 

gag cgc ccc tcc ggc atg ttc gac tcg tcc gtc ctc tgt gag tgc tat 13587 
Glu Arg Pro Ser Gly Met Phe Asp Ser Ser Val Leu Cys Glu Cys Tyr 
270 275 280 

gac gca ggc tgt gct tgg tat gag ctc acg ccc gcc gag act aca gtt 13635 
Asp Ala Gly Cys Ala Trp Tyr Glu Leu Thr Pro Ala Glu Thr Thr Val 
285 290 295 

agg cta cga gcg tac atg aac acc ccg ggg ctt ccc gtg tgc cag gac 13683 
Arg Leu Arg Ala Tyr Met Asn Thr Pro Gly Leu Pro Val Cys Gln Asp 
300 305 310 

cat ctt gaa ttt tgg gag ggc gtc ttt aca ggc ctc act cat ata gat 13731 
His Leu Glu Phe Trp Glu Gly Val Phe Thr Gly Leu Thr His Ile Asp 
315 320 325 

gcc cac ttt cta tcc cag aca aag cag agt ggg gag aac ctt cct tac 13779 
Ala His Phe Leu Ser Gln Thr Lys Gln Ser Gly Glu Asn Leu Pro Tyr 
330 335 340 345 

ctg gta gcg tac caa gcc acc gtg tgc gct agg gct caa gcc cct ccc 13827 
Leu Val Ala Tyr Gln Ala Thr Val Cys Ala Arg Ala Gln Ala Pro Pro 
350 355 360 

cca tcg tgg gac cag atg tgg aag tgt ttg att cgc ctc aag ccc acc 13875 
Pro Ser Trp Asp Gln Met Trp Lys Cys Leu Ile Arg Leu Lys Pro Thr 
365 370 375 

ctc cat ggg cca aca ccc ctg cta tac aga ctg ggc gct gtt cag aat 13923 
Leu His Gly Pro Thr Pro Leu Leu Tyr Arg Leu Gly Ala Val Gln Asn 
380 385 390 

gaa atc acc ctg acg cac cca gtc acc aaa tac atc atg aca tgc atg 13971 
Glu Ile Thr Leu Thr His Pro Val Thr Lys Tyr Ile Met Thr Cys Met 
395 400 405 

tcg gcc gac ctg gag gtc gtc acg agc acc tgg gtg ctc gtt ggc ggc 14019 
Ser Ala Asp Leu Glu Val Val Thr Ser Thr Trp Val Leu Val Gly Gly 
410 415 420 425 

gtc ctg gct gct ttg gcc gcg tat tgc ctg tca aca ggc tgc gtg gtc 14067 
Val Leu Ala Ala Leu Ala Ala Tyr Cys Leu Ser Thr Gly Cys Val Val 
430 435 440 

ata gtg ggc agg gtc gtc ttg tcc ggg aag ccg gca atc ata cct gac 14115 
Ile Val Gly Arg Val Val Leu Ser Gly Lys Pro Ala Ile Ile Pro Asp 
445 450 455 

agg gaa gtc ctc tac cga gag ttc gat gag atg gaa gag tgc tct cag 14163 
Arg Glu Val Leu Tyr Arg Glu Phe Asp Glu Met Glu Glu Cys Ser Gln 
460 465 470 

cac tta ccg tac atc gag caa ggg atg atg ctc gcc gag cag ttc aag 14211 
His Leu Pro Tyr Ile Glu Gln Gly Met Met Leu Ala Glu Gln Phe Lys 
475 480 485 

cag aag gcc ctc ggc ctc ctg cag acc gcg tcc cgt cag gca gag gtt 14259 
Gln Lys Ala Leu Gly Leu Leu Gln Thr Ala Ser Arg Gln Ala Glu Val 
490 495 500 505 

atc gcc cct gct gtc cag acc aac tgg caa aaa ctc gag acc ttc tgg 14307 
Ile Ala Pro Ala Val Gln Thr Asn Trp Gln Lys Leu Glu Thr Phe Trp 
510 515 520 

gcg aag cat atg tgg aac ttc atc agt ggg ata caa tac ttg gcg ggc 14355 
Ala Lys His Met Trp Asn Phe Ile Ser Gly Ile Gln Tyr Leu Ala Gly 
525 530 535 

ttg tca acg ctg cct ggt aac ccc gcc att gct tca ttg atg gct ttt 14403 
Leu Ser Thr Leu Pro Gly Asn Pro Ala Ile Ala Ser Leu Met Ala Phe 
540 545 550 

aca gct gct gtc acc agc cca cta acc act agc caa acc ctc ctc ttc 14451 
Thr Ala Ala Val Thr Ser Pro Leu Thr Thr Ser Gln Thr Leu Leu Phe 
555 560 565 

aac ata ttg ggg ggg tgg gtg gct gcc cag ctc gcc gcc ccc ggt gcc 14499 
Asn Ile Leu Gly Gly Trp Val Ala Ala Gln Leu Ala Ala Pro Gly Ala 
570 575 580 585 

gct act gcc ttt gtg ggc gct ggc tta gct ggc gcc gcc atc ggc agt 14547 
Ala Thr Ala Phe Val Gly Ala Gly Leu Ala Gly Ala Ala Ile Gly Ser 
590 595 600 

gtt gga ctg ggg aag gtc ctc ata gac atc ctt gca ggg tat ggc gcg 14595 
Val Gly Leu Gly Lys Val Leu Ile Asp Ile Leu Ala Gly Tyr Gly Ala 
605 610 615 

ggc gtg gcg gga gct ctt gtg gca ttc aag atc atg agc ggt gag gtc 14643 
Gly Val Ala Gly Ala Leu Val Ala Phe Lys Ile Met Ser Gly Glu Val 
620 625 630 

ccc tcc acg gag gac ctg gtc aat cta ctg ccc gcc atc ctc tcg ccc 14691 
Pro Ser Thr Glu Asp Leu Val Asn Leu Leu Pro Ala Ile Leu Ser Pro 
635 640 645 

gga gcc ctc gta gtc ggc gtg gtc tgt gca gca ata ctg cgc cgg cac 14739 
Gly Ala Leu Val Val Gly Val Val Cys Ala Ala Ile Leu Arg Arg His 
650 655 660 665 

gtt ggc ccg ggc gag ggg gca gtg cag tgg atg aac cgg ctg ata gcc 14787 
Val Gly Pro Gly Glu Gly Ala Val Gln Trp Met Asn Arg Leu Ile Ala 
670 675 680 

ttc gcc tcc cgg ggg aac cat gtt tcc ccc acg cac tac gtg ccg gag 14835 
Phe Ala Ser Arg Gly Asn His Val Ser Pro Thr His Tyr Val Pro Glu 
685 690 695 

agc gat gca gct gcc cgc gtc act gcc ata ctc agc agc ctc act gta 14883 
Ser Asp Ala Ala Ala Arg Val Thr Ala Ile Leu Ser Ser Leu Thr Val 
700 705 710 

acc cag ctc ctg agg cga ctg cac cag tgg ata agc tcg gag tgt acc 14931 
Thr Gln Leu Leu Arg Arg Leu His Gln Trp Ile Ser Ser Glu Cys Thr 
715 720 725 

act cca tgc tcc ggt tcc tgg cta agg gac atc tgg gac tgg ata tgc 14979 
Thr Pro Cys Ser Gly Ser Trp Leu Arg Asp Ile Trp Asp Trp Ile Cys 
730 735 740 745 

gag gtg ttg agc gac ttt aag acc tgg cta aaa gct aag ctc atg cca 15027 
Glu Val Leu Ser Asp Phe Lys Thr Trp Leu Lys Ala Lys Leu Met Pro 
750 755 760 

cag ctg cct ggg atc ccc ttt gtg tcc tgc cag cgc ggg tat aag ggg 15075 
Gln Leu Pro Gly Ile Pro Phe Val Ser Cys Gln Arg Gly Tyr Lys Gly 
765 770 775 

gtc tgg cga ggg gac ggc atc atg cac act cgc tgc cac tgt gga gct 15123 
Val Trp Arg Gly Asp Gly Ile Met His Thr Arg Cys His Cys Gly Ala 
780 785 790 

gag atc act gga cat gtc aaa aac ggg acg atg agg atc gtc ggt cct 15171 
Glu Ile Thr Gly His Val Lys Asn Gly Thr Met Arg Ile Val Gly Pro 
795 800 805 

agg acc tgc agg aac atg tgg agt ggg acc ttc ccc att aat gcc tac 15219 
Arg Thr Cys Arg Asn Met Trp Ser Gly Thr Phe Pro Ile Asn Ala Tyr 
810 815 820 825 

acc acg ggc ccc tgt acc ccc ctt cct gcg ccg aac tac acg ttc gcg 15267 
Thr Thr Gly Pro Cys Thr Pro Leu Pro Ala Pro Asn Tyr Thr Phe Ala 
830 835 840 



cta tgg agg gtg tct gca gag gaa tac gtg gag ata agg cag gtg ggg 15315 
Leu Trp Arg Val Ser Ala Glu Glu Tyr Val Glu Ile Arg Gln Val Gly 
845 850 855 

gac ttc cac tac gtg acg ggt atg act act gac aat ctt aaa tgc ccg 15363 
Asp Phe His Tyr Val Thr Gly Met Thr Thr Asp Asn Leu Lys Cys Pro 
860 865 870 

tgc cag gtc cca tcg ccc gaa ttt ttc aca gaa ttg gac ggg gtg cgc 15411 
Cys Gln Val Pro Ser Pro Glu Phe Phe Thr Glu Leu Asp Gly Val Arg 
875 880 885 

cta cat agg ttt gcg ccc ccc tgc aag ccc ttg ctg cgg gag gag gta 15459 
Leu His Arg Phe Ala Pro Pro Cys Lys Pro Leu Leu Arg Glu Glu Val 
890 895 900 905 

tca ttc aga gta gga ctc cac gaa tac ccg gta ggg tcg caa tta cct 15507 
Ser Phe Arg Val Gly Leu His Glu Tyr Pro Val Gly Ser Gln Leu Pro 
910 915 920 

tgc gag ccc gaa ccg gac gtg gcc gtg ttg acg tcc atg ctc act gat 15555 
Cys Glu Pro Glu Pro Asp Val Ala Val Leu Thr Ser Met Leu Thr Asp 
925 930 935 

ccc tcc cat ata aca gca gag gcg gcc ggg cga agg ttg gcg agg gga 15603 
Pro Ser His Ile Thr Ala Glu Ala Ala Gly Arg Arg Leu Ala Arg Gly 
940 945 950 

tca ccc ccc tct gtg gcc agc tcc tcg gct agc cag cta tcc gct cca 15651 
Ser Pro Pro Ser Val Ala Ser Ser Ser Ala Ser Gln Leu Ser Ala Pro 
955 960 965 

tct ctc aag gca act tgc acc gct aac cat gac tcc cct gat gct gag 15699 
Ser Leu Lys Ala Thr Cys Thr Ala Asn His Asp Ser Pro Asp Ala Glu 
970 975 980 985 

ctc ata gag gcc aac ctc cta tgg agg cag gag atg ggc ggc aac atc 15747 
Leu Ile Glu Ala Asn Leu Leu Trp Arg Gln Glu Met Gly Gly Asn Ile 
990 995 1000 

acc agg gtt gag tca gaa aac aaa gtg gtg att ctg gac tcc ttc gat 15795 
Thr Arg Val Glu Ser Glu Asn Lys Val Val Ile Leu Asp Ser Phe Asp 
1005 1010 1015 

ccg ctt gtg gcg gag gag gac gag cgg gag atc tcc gta ccc gca gaa 15843 
Pro Leu Val Ala Glu Glu Asp Glu Arg Glu Ile Ser Val Pro Ala Glu 
1020 1025 1030 

atc ctg cgg aag tct cgg aga ttc gcc cag gcc ctg ccc gtt tgg gcg 15891 
Ile Leu Arg Lys Ser Arg Arg Phe Ala Gln Ala Leu Pro Val Trp Ala 
1035 1040 1045 

cgg ccg gac tat aac ccc ccg cta gtg gag acg tgg aaa aag ccc gac 15939 
Arg Pro Asp Tyr Asn Pro Pro Leu Val Glu Thr Trp Lys Lys Pro Asp 
1050 1055 1060 1065 

tac gaa cca cct gtg gtc cat ggc tgc ccg ctt cca cct cca aag tcc 15987 
Tyr Glu Pro Pro Val Val His Gly Cys Pro Leu Pro Pro Pro Lys Ser 
1070 1075 1080 

cct cct gtg cct ccg cct cgg aag aag cgg acg gtg gtc ctc act gaa 16035 
Pro Pro Val Pro Pro Pro Arg Lys Lys Arg Thr Val Val Leu Thr Glu 
1085 1090 1095 

tca acc cta tct act gcc ttg gcc gag ctc gcc acc aga agc ttt ggc 16083 
Ser Thr Leu Ser Thr Ala Leu Ala Glu Leu Ala Thr Arg Ser Phe Gly 
1100 1105 1110 

agc tcc tca act tcc ggc att acg ggc gac aat acg aca aca tcc tct 16131 
Ser Ser Ser Thr Ser Gly Ile Thr Gly Asp Asn Thr Thr Thr Ser Ser 
1115 1120 1125 

gag ccc gcc cct tct ggc tgc ccc ccc gac tcc gac gct gag tcc tat 16179 
Glu Pro Ala Pro Ser Gly Cys Pro Pro Asp Ser Asp Ala Glu Ser Tyr 
1130 1135 1140 1145 

tcc tcc atg ccc ccc ctg gag ggg gag cct ggg gat ccg gat ctt agc 16227 
Ser Ser Met Pro Pro Leu Glu Gly Glu Pro Gly Asp Pro Asp Leu Ser 
1150 1155 1160 

gac ggg tca tgg tca acg gtc agt agt gag gcc aac gcg gag gat gtc 16275 
Asp Gly Ser Trp Ser Thr Val Ser Ser Glu Ala Asn Ala Glu Asp Val 
1165 1170 1175 

gtg tgc tgc tca atg tct tac tct tgg aca ggc gca ctc gtc acc ccg 16323 
Val Cys Cys Ser Met Ser Tyr Ser Trp Thr Gly Ala Leu Val Thr Pro 
1180 1185 1190 

tgc gcc gcg gaa gaa cag aaa ctg ccc atc aat gca cta agc aac tcg 16371 
Cys Ala Ala Glu Glu Gln Lys Leu Pro Ile Asn Ala Leu Ser Asn Ser 
1195 1200 1205 

ttg cta cgt cac cac aat ttg gtg tat tcc acc acc tca cgc agt gct 16419 
Leu Leu Arg His His Asn Leu Val Tyr Ser Thr Thr Ser Arg Ser Ala 
1210 1215 1220 1225 

tgc caa agg cag aag aaa gtc aca ttt gac aga ctg caa gtt ctg gac 16467 
Cys Gln Arg Gln Lys Lys Val Thr Phe Asp Arg Leu Gln Val Leu Asp 
1230 1235 1240 

agc cat tac cag gac gta ctc aag gag gtt aaa gca gcg gcg tca aaa 16515 
Ser His Tyr Gln Asp Val Leu Lys Glu Val Lys Ala Ala Ala Ser Lys 
1245 1250 1255 

gtg aag gct aac ttg cta tcc gta gag gaa gct tgc agc ctg acg ccc 16563 
Val Lys Ala Asn Leu Leu Ser Val Glu Glu Ala Cys Ser Leu Thr Pro 
1260 1265 1270 

cca cac tca gcc aaa tcc aag ttt ggt tat ggg gca aaa gac gtc cgt 16611 
Pro His Ser Ala Lys Ser Lys Phe Gly Tyr Gly Ala Lys Asp Val Arg 
1275 1280 1285 

tgc cat gcc aga aag gcc gta acc cac atc aac tcc gtg tgg aaa gac 16659 
Cys His Ala Arg Lys Ala Val Thr His Ile Asn Ser Val Trp Lys Asp 
1290 1295 1300 1305 

ctt ctg gaa gac aat gta aca cca ata gac act acc atc atg gct aag 16707 
Leu Leu Glu Asp Asn Val Thr Pro Ile Asp Thr Thr Ile Met Ala Lys 
1310 1315 1320 



aac gag gtt ttc tgc gtt cag cct gag aag ggg ggt cgt aag cca gct 16755 
Asn Glu Val Phe Cys Val Gln Pro Glu Lys Gly Gly Arg Lys Pro Ala 
1325 1330 1335 

cgt ctc atc gtg ttc ccc gat ctg ggc gtg cgc gtg tgc gaa aag atg 16803 
Arg Leu Ile Val Phe Pro Asp Leu Gly Val Arg Val Cys Glu Lys Met 
1340 1345 1350 

gct ttg tac gac gtg gtt aca aag ctc ccc ttg gcc gtg atg gga agc 16851 
Ala Leu Tyr Asp Val Val Thr Lys Leu Pro Leu Ala Val Met Gly Ser 
1355 1360 1365 

tcc tac gga ttc caa tac tca cca gga cag cgg gtt gaa ttc ctc gtg 16899 
Ser Tyr Gly Phe Gln Tyr Ser Pro Gly Gln Arg Val Glu Phe Leu Val 
1370 1375 1380 1385 

caa gcg tgg aag tcc aag aaa acc cca atg ggg ttc tcg tat gat acc 16947 
Gln Ala Trp Lys Ser Lys Lys Thr Pro Met Gly Phe Ser Tyr Asp Thr 
1390 1395 1400 

cgc tgc ttt gac tcc aca gtc act gag agc gac atc cgt acg gag gag 16995 
Arg Cys Phe Asp Ser Thr Val Thr Glu Ser Asp Ile Arg Thr Glu Glu 
1405 1410 1415 

gca atc tac caa tgt tgt gac ctc gac ccc caa gcc cgc gtg gcc atc 17043 
Ala Ile Tyr Gln Cys Cys Asp Leu Asp Pro Gln Ala Arg Val Ala Ile 
1420 1425 1430 

aag tcc ctc acc gag agg ctt tat gtt ggg ggc cct ctt acc aat tca 17091 
Lys Ser Leu Thr Glu Arg Leu Tyr Val Gly Gly Pro Leu Thr Asn Ser 
1435 1440 1445 

agg ggg gag aac tgc ggc tat cgc agg tgc cgc gcg agc ggc gta ctg 17139 
Arg Gly Glu Asn Cys Gly Tyr Arg Arg Cys Arg Ala Ser Gly Val Leu 
1450 1455 1460 1465 

aca act agc tgt ggt aac acc ctc act tgc tac atc aag gcc cgg gca 17187 
Thr Thr Ser Cys Gly Asn Thr Leu Thr Cys Tyr Ile Lys Ala Arg Ala 
1470 1475 1480 

gcc tgt cga gcc gca ggg ctc cag gac tgc acc atg ctc gtg tgt ggc 17235 
Ala Cys Arg Ala Ala Gly Leu Gln Asp Cys Thr Met Leu Val Cys Gly 
1485 1490 1495 

gac gac tta gtc gtt atc tgt gaa agc gcg ggg gtc cag gag gac gcg 17283 
Asp Asp Leu Val Val Ile Cys Glu Ser Ala Gly Val Gln Glu Asp Ala 
1500 1505 1510 

gcg agc ctg aga gcc ttc acg gag gct atg acc agg tac tcc gcc ccc 17331 
Ala Ser Leu Arg Ala Phe Thr Glu Ala Met Thr Arg Tyr Ser Ala Pro 
1515 1520 1525 

cct ggg gac ccc cca caa cca gaa tac gac ttg gag ctc ata aca tca 17379 
Pro Gly Asp Pro Pro Gln Pro Glu Tyr Asp Leu Glu Leu Ile Thr Ser 
1530 1535 1540 1545 

tgc tcc tcc aac gtg tca gtc gcc cac gac ggc gct gga aag agg gtc 17427 
Cys Ser Ser Asn Val Ser Val Ala His Asp Gly Ala Gly Lys Arg Val 
1550 1555 1560 



tac tac ctc acc cgt gac cct aca acc ccc ctc gcg aga gct gcg tgg 17475 
Tyr Tyr Leu Thr Arg Asp Pro Thr Thr Pro Leu Ala Arg Ala Ala Trp 
1565 1570 1575 

gag aca gca aga cac act cca gtc aat tcc tgg cta ggc aac ata atc 17523 
Glu Thr Ala Arg His Thr Pro Val Asn Ser Trp Leu Gly Asn Ile Ile 
1580 1585 1590 

atg ttt gcc ccc aca ctg tgg gcg agg atg ata ctg atg acc cat ttc 17571 
Met Phe Ala Pro Thr Leu Trp Ala Arg Met Ile Leu Met Thr His Phe 
1595 1600 1605 

ttt agc gtc ctt ata gcc agg gac cag ctt gaa cag gcc ctc gat tgc 17619 
Phe Ser Val Leu Ile Ala Arg Asp Gln Leu Glu Gln Ala Leu Asp Cys 
1610 1615 1620 1625 

gag atc tac ggg gcc tgc tac tcc ata gaa cca ctg gat cta cct cca 17667 
Glu Ile Tyr Gly Ala Cys Tyr Ser Ile Glu Pro Leu Asp Leu Pro Pro 
1630 1635 1640 

atc att caa aga ctc cat ggc ctc agc gca ttt tca ctc cac agt tac 17715 
Ile Ile Gln Arg Leu His Gly Leu Ser Ala Phe Ser Leu His Ser Tyr 
1645 1650 1655 

tct cca ggt gaa atc aat agg gtg gcc gca tgc ctc aga aaa ctt ggg 17763 
Ser Pro Gly Glu Ile Asn Arg Val Ala Ala Cys Leu Arg Lys Leu Gly 
1660 1665 1670 

gta ccg ccc ttg cga gct tgg aga cac cgg gcc cgg agc gtc cgc gct 17811 
Val Pro Pro Leu Arg Ala Trp Arg His Arg Ala Arg Ser Val Arg Ala 
1675 1680 1685 

agg ctt ctg gcc aga gga ggc agg gct gcc ata tgt ggc aag tac ctc 17859 
Arg Leu Leu Ala Arg Gly Gly Arg Ala Ala Ile Cys Gly Lys Tyr Leu 
1690 1695 1700 1705 

ttc aac tgg gca gta aga aca aag ctc aaa ctc act cca ata gcg gcc 17907 
Phe Asn Trp Ala Val Arg Thr Lys Leu Lys Leu Thr Pro Ile Ala Ala 
1710 1715 1720 

gct ggc cag ctg gac ttg tcc ggc tgg ttc acg gct ggc tac agc ggg 17955 
Ala Gly Gln Leu Asp Leu Ser Gly Trp Phe Thr Ala Gly Tyr Ser Gly 
1725 1730 1735 

gga gac att tat cac agc gtg tct cat gcc cgg ccc cgc tgg atc tgg 18003 
Gly Asp Ile Tyr His Ser Val Ser His Ala Arg Pro Arg Trp Ile Trp 
1740 1745 1750 

ttt tgc cta ctc ctg ctt gct gca ggg gta ggc atc tac ctc ctc ccc 18051 
Phe Cys Leu Leu Leu Leu Ala Ala Gly Val Gly Ile Tyr Leu Leu Pro 
1755 1760 1765 

aac cga tgaaggttgg ggtaaacact ccggcctaaa aaaaaaaaaa aatctagaac 18107 
Asn Arg 
1770 

ccgagtcgac tttgttccca ctgtactttt agctcgtaca aaatacaata tacttttcat 18167 
ttctccgtaa acaacatgtt ttcccatgta atatcctttt ctatttttcg ttccgttacc 18227 
aactttacac atactttata tagctattca cttctataca ctaaaaaact aagacaattt 18287 
taattttgct gcctgccata tttcaatttg ttataaattc ctataattta tcctattagt 18347 
agctaaaaaa agatgaatgt gaatcgaatc ctaagagaat tggatctgat ccacaggacg 18407 
ggtgtggtcg ccatgatcgc gtagtcgata gtggctccaa gtagcgaagc gagcaggact 18467 
gggcggcggc caaagcggtc ggacagtgct ccgagaacgg gtgcgcatag aaattgcatc 18527 
aacgcatata gcgctagcag cacgccatag tgactggcga tgctgtcgga atggacgata 18587 
tcccgcaaga ggcccggcag taccggcata accaagccta tgcctacagc atccagggtg 18647 
acggtgccga ggatgacgat gagcgcattg ttagatttca tacacggtgc ctgactgcgt 18707 
tagcaattta actgtgataa actaccgcat taaagctttt tctttccaat tttttttttt 18767 
tcgtcattat aaaaatcatt acgaccgaga ttcccgggta ataactgata taattaaatt 18827 
gaagctctaa tttgtgagtt tagtatacat gcatttactt ataatacagt tttttagttt 18887 
tgctggccgc atcttctcaa atatgcttcc cagcctgctt ttctgtaacg ttcaccctct 18947 
accttagcat cccttccctt tgcaaatagt cctcttccaa caataataat gtcagatcct 19007 
gtagagacca catcatccac ggttctatac tgttgaccca atgcgtctcc cttgtcatct 19067 
aaacccacac cgggtgtcat aatcaaccaa tcgtaacctt catctcttcc acccatgtct 19127 
ctttgagcaa taaagccgat aacaaaatct ttgtcgctct tcgcaatgtc aacagtaccc 19187 
ttagtatatt ctccagtaga tagggagccc ttgcatgaca attctgctaa catcaaaagg 19247 
cctctaggtt cctttgttac ttcttctgcc gcctgcttca aaccgctaac aatacctggg 19307 
cccaccacac cgtgtgcatt cgtaatgtct gcccattctg ctattctgta tacacccgca 19367 
gagtactgca atttgactgt attaccaatg tcagcaaatt ttctgtcttc gaagagtaaa 19427 
aaattgtact tggcggataa tgcctttagc ggcttaactg tgccctccat ggaaaaatca 19487 
gtcaagatat ccacatgtgt ttttagtaaa caaattttgg gacctaatgc ttcaactaac 19547 
tccagtaatt ccttggtggt acgaacatcc aatgaagcac acaagtttgt ttgcttttcg 19607 
tgcatgatat taaatagctt ggcagcaaca ggactaggat gagtagcagc acgttcctta 19667 
tatgtagctt tcgacatgat ttatcttcgt ttcctgcagg tttttgttct gtgcagttgg 19727 
gttaagaata ctgggcaatt tcatgtttct tcaacactac atatgcgtat atataccaat 19787 
ctaagtctgt gctccttcct tcgttcttcc ttctgttcgg agattaccga atcaaaaaaa 19847 
tttcaaggaa accgaaatca aaaaaaagaa taaaaaaaaa atgatgaatt gaaaagctta 19907 
tcgat 19912 



<210> 9 
<211> 1771 
<212> PRT 
<213> Artificial Sequence 

<220> 
<223> Description of Artificial Sequence: pd.deltaNS3NS5 

<400> 9 
Met Ala Ala Tyr Ala Ala Gln Gly Tyr Lys Val Leu Val Leu Asn Pro 
1 5 10 15 

Ser Val Ala Ala Thr Leu Gly Phe Gly Ala Tyr Met Ser Lys Ala His 
20 25 30 

Gly Ile Asp Pro Asn Ile Arg Thr Gly Val Arg Thr Ile Thr Thr Gly 
35 40 45 

Ser Pro Ile Thr Tyr Ser Thr Tyr Gly Lys Phe Leu Ala Asp Gly Gly 
50 55 60 

Cys Ser Gly Gly Ala Tyr Asp Ile Ile Ile Cys Asp Glu Cys His Ser 
65 70 75 80 

Thr Asp Ala Thr Ser Ile Leu Gly Ile Gly Thr Val Leu Asp Gln Ala 
85 90 95 

Glu Thr Ala Gly Ala Arg Leu Val Val Leu Ala Thr Ala Thr Pro Pro 
100 105 110 

Gly Ser Val Thr Val Pro His Pro Asn Ile Glu Glu Val Ala Leu Ser 
115 120 125 

Thr Thr Gly Glu Ile Pro Phe Tyr Gly Lys Ala Ile Pro Leu Glu Val 
130 135 140 

Ile Lys Gly Gly Arg His Leu Ile Phe Cys His Ser Lys Lys Lys Cys 
145 150 155 160 

Asp Glu Leu Ala Ala Lys Leu Val Ala Leu Gly Ile Asn Ala Val Ala 
165 170 175 

Tyr Tyr Arg Gly Leu Asp Val Ser Val Ile Pro Thr Ser Gly Asp Val 
180 185 190 

Val Val Val Ala Thr Asp Ala Leu Met Thr Gly Tyr Thr Gly Asp Phe 
195 200 205 

Asp Ser Val Ile Asp Cys Asn Thr Cys Val Thr Gln Thr Val Asp Phe 
210 215 220 

Ser Leu Asp Pro Thr Phe Thr Ile Glu Thr Ile Thr Leu Pro Gln Asp 
225 230 235 240 

Ala Val Ser Arg Thr Gln Arg Arg Gly Arg Thr Gly Arg Gly Lys Pro 
245 250 255 

Gly lie Tyr Arg Phe Val Ala Pro Gly Glu Arg Pro Ser Gly Met Phe 
260 265 270 

Asp Ser Ser Val Leu Cys Glu Cys Tyr Asp Ala Gly Cys Ala Trp Tyr 
275 280 285 

Glu Leu Thr Pro Ala Glu Thr Thr Val Arg Leu Arg Ala Tyr Met Asn 
290 295 300 

Thr Pro Gly Leu Pro Val Cys Gin Asp His Leu Glu Phe Trp Glu Gly 
305 310 315 320 

Val Phe Thr Gly Leu Thr His He Asp Ala His Phe Leu Ser Gin Thr 
325 330 335 

Lys Gin Ser Gly Glu Asn Leu Pro Tyr Leu Val Ala Tyr Gin Ala Thr 

340 345 350 

Val Cys Ala Arg Ala Gin Ala Pro Pro Pro Ser Trp Asp Gin Met Trp 
355 360 365 

Lys Cys Leu lie Arg Leu Lys Pro Thr Leu His Gly Pro Thr Pro Leu 
370 375 380 

Leu Tyr Arg Leu Gly Ala Val Gin Asn Glu lie Thr Leu Thr His Pro 
385 390 395 400 

Val Thr Lys Tyr He Met Thr Cys Met Ser Ala Asp Leu Glu Val Val 
405 410 415 

Thr Ser Thr Trp Val Leu Val Gly Gly Val Leu Ala Ala Leu Ala Ala 
420 425 430 

Tyr Cys Leu Ser Thr Gly Cys Val Val Ile Val Gly Arg Val Val Leu
435 440 445

Ser Gly Lys Pro Ala Ile Ile Pro Asp Arg Glu Val Leu Tyr Arg Glu
450 455 460

Phe Asp Glu Met Glu Glu Cys Ser Gln His Leu Pro Tyr Ile Glu Gln
465 470 475 480

Gly Met Met Leu Ala Glu Gln Phe Lys Gln Lys Ala Leu Gly Leu Leu
485 490 495

Gln Thr Ala Ser Arg Gln Ala Glu Val Ile Ala Pro Ala Val Gln Thr
500 505 510

Asn Trp Gln Lys Leu Glu Thr Phe Trp Ala Lys His Met Trp Asn Phe
515 520 525

Ile Ser Gly Ile Gln Tyr Leu Ala Gly Leu Ser Thr Leu Pro Gly Asn
530 535 540

Pro Ala Ile Ala Ser Leu Met Ala Phe Thr Ala Ala Val Thr Ser Pro
545 550 555 560 

Leu Thr Thr Ser Gln Thr Leu Leu Phe Asn Ile Leu Gly Gly Trp Val
565 570 575

Ala Ala Gln Leu Ala Ala Pro Gly Ala Ala Thr Ala Phe Val Gly Ala
580 585 590

Gly Leu Ala Gly Ala Ala Ile Gly Ser Val Gly Leu Gly Lys Val Leu
595 600 605

Ile Asp Ile Leu Ala Gly Tyr Gly Ala Gly Val Ala Gly Ala Leu Val
610 615 620

Ala Phe Lys Ile Met Ser Gly Glu Val Pro Ser Thr Glu Asp Leu Val
625 630 635 640

Asn Leu Leu Pro Ala Ile Leu Ser Pro Gly Ala Leu Val Val Gly Val
645 650 655

Val Cys Ala Ala Ile Leu Arg Arg His Val Gly Pro Gly Glu Gly Ala
660 665 670

Val Gln Trp Met Asn Arg Leu Ile Ala Phe Ala Ser Arg Gly Asn His
675 680 685 

Val Ser Pro Thr His Tyr Val Pro Glu Ser Asp Ala Ala Ala Arg Val 
690 695 700 

Thr Ala Ile Leu Ser Ser Leu Thr Val Thr Gln Leu Leu Arg Arg Leu
705 710 715 720

His Gln Trp Ile Ser Ser Glu Cys Thr Thr Pro Cys Ser Gly Ser Trp
725 730 735

Leu Arg Asp Ile Trp Asp Trp Ile Cys Glu Val Leu Ser Asp Phe Lys
740 745 750

Thr Trp Leu Lys Ala Lys Leu Met Pro Gln Leu Pro Gly Ile Pro Phe
755 760 765

Val Ser Cys Gln Arg Gly Tyr Lys Gly Val Trp Arg Gly Asp Gly Ile
770 775 780

Met His Thr Arg Cys His Cys Gly Ala Glu Ile Thr Gly His Val Lys
785 790 795 800

Asn Gly Thr Met Arg Ile Val Gly Pro Arg Thr Cys Arg Asn Met Trp
805 810 815

Ser Gly Thr Phe Pro Ile Asn Ala Tyr Thr Thr Gly Pro Cys Thr Pro
820 825 830 

Leu Pro Ala Pro Asn Tyr Thr Phe Ala Leu Trp Arg Val Ser Ala Glu 
835 840 845 

Glu Tyr Val Glu Ile Arg Gln Val Gly Asp Phe His Tyr Val Thr Gly
850 855 860

Met Thr Thr Asp Asn Leu Lys Cys Pro Cys Gln Val Pro Ser Pro Glu
865 870 875 880 



Phe Phe Thr Glu Leu Asp Gly Val Arg Leu His Arg Phe Ala Pro Pro
885 890 895

Cys Lys Pro Leu Leu Arg Glu Glu Val Ser Phe Arg Val Gly Leu His
900 905 910

Glu Tyr Pro Val Gly Ser Gln Leu Pro Cys Glu Pro Glu Pro Asp Val
915 920 925

Ala Val Leu Thr Ser Met Leu Thr Asp Pro Ser His Ile Thr Ala Glu
930 935 940

Ala Ala Gly Arg Arg Leu Ala Arg Gly Ser Pro Pro Ser Val Ala Ser
945 950 955 960

Ser Ser Ala Ser Gln Leu Ser Ala Pro Ser Leu Lys Ala Thr Cys Thr
965 970 975

Ala Asn His Asp Ser Pro Asp Ala Glu Leu Ile Glu Ala Asn Leu Leu
980 985 990

Trp Arg Gln Glu Met Gly Gly Asn Ile Thr Arg Val Glu Ser Glu Asn
995 1000 1005

Lys Val Val Ile Leu Asp Ser Phe Asp Pro Leu Val Ala Glu Glu Asp
1010 1015 1020

Glu Arg Glu Ile Ser Val Pro Ala Glu Ile Leu Arg Lys Ser Arg Arg
1025 1030 1035 1040

Phe Ala Gln Ala Leu Pro Val Trp Ala Arg Pro Asp Tyr Asn Pro Pro
1045 1050 1055 

Leu Val Glu Thr Trp Lys Lys Pro Asp Tyr Glu Pro Pro Val Val His 
1060 1065 1070 

Gly Cys Pro Leu Pro Pro Pro Lys Ser Pro Pro Val Pro Pro Pro Arg 
1075 1080 1085 

Lys Lys Arg Thr Val Val Leu Thr Glu Ser Thr Leu Ser Thr Ala Leu 
1090 1095 1100 

Ala Glu Leu Ala Thr Arg Ser Phe Gly Ser Ser Ser Thr Ser Gly Ile
1105 1110 1115 1120

Thr Gly Asp Asn Thr Thr Thr Ser Ser Glu Pro Ala Pro Ser Gly Cys 
1125 1130 1135 

Pro Pro Asp Ser Asp Ala Glu Ser Tyr Ser Ser Met Pro Pro Leu Glu 
1140 1145 1150 

Gly Glu Pro Gly Asp Pro Asp Leu Ser Asp Gly Ser Trp Ser Thr Val 
1155 1160 1165 

Ser Ser Glu Ala Asn Ala Glu Asp Val Val Cys Cys Ser Met Ser Tyr 
1170 1175 1180 

Ser Trp Thr Gly Ala Leu Val Thr Pro Cys Ala Ala Glu Glu Gln Lys
1185 1190 1195 1200

Leu Pro Ile Asn Ala Leu Ser Asn Ser Leu Leu Arg His His Asn Leu
1205 1210 1215

Val Tyr Ser Thr Thr Ser Arg Ser Ala Cys Gln Arg Gln Lys Lys Val
1220 1225 1230

Thr Phe Asp Arg Leu Gln Val Leu Asp Ser His Tyr Gln Asp Val Leu
1235 1240 1245 

Lys Glu Val Lys Ala Ala Ala Ser Lys Val Lys Ala Asn Leu Leu Ser 
1250 1255 1260 

Val Glu Glu Ala Cys Ser Leu Thr Pro Pro His Ser Ala Lys Ser Lys 
1265 1270 1275 1280

Phe Gly Tyr Gly Ala Lys Asp Val Arg Cys His Ala Arg Lys Ala Val 
1285 1290 1295 

Thr His Ile Asn Ser Val Trp Lys Asp Leu Leu Glu Asp Asn Val Thr
1300 1305 1310

Pro Ile Asp Thr Thr Ile Met Ala Lys Asn Glu Val Phe Cys Val Gln
1315 1320 1325

Pro Glu Lys Gly Gly Arg Lys Pro Ala Arg Leu Ile Val Phe Pro Asp
1330 1335 1340

Leu Gly Val Arg Val Cys Glu Lys Met Ala Leu Tyr Asp Val Val Thr
1345 1350 1355 1360

Lys Leu Pro Leu Ala Val Met Gly Ser Ser Tyr Gly Phe Gln Tyr Ser
1365 1370 1375

Pro Gly Gln Arg Val Glu Phe Leu Val Gln Ala Trp Lys Ser Lys Lys
1380 1385 1390

Thr Pro Met Gly Phe Ser Tyr Asp Thr Arg Cys Phe Asp Ser Thr Val
1395 1400 1405

Thr Glu Ser Asp Ile Arg Thr Glu Glu Ala Ile Tyr Gln Cys Cys Asp
1410 1415 1420

Leu Asp Pro Gln Ala Arg Val Ala Ile Lys Ser Leu Thr Glu Arg Leu
1425 1430 1435 1440

Tyr Val Gly Gly Pro Leu Thr Asn Ser Arg Gly Glu Asn Cys Gly Tyr 
1445 1450 1455 

Arg Arg Cys Arg Ala Ser Gly Val Leu Thr Thr Ser Cys Gly Asn Thr 
1460 1465 1470 

Leu Thr Cys Tyr Ile Lys Ala Arg Ala Ala Cys Arg Ala Ala Gly Leu
1475 1480 1485

Gln Asp Cys Thr Met Leu Val Cys Gly Asp Asp Leu Val Val Ile Cys
1490 1495 1500

Glu Ser Ala Gly Val Gln Glu Asp Ala Ala Ser Leu Arg Ala Phe Thr
1505 1510 1515 1520

Glu Ala Met Thr Arg Tyr Ser Ala Pro Pro Gly Asp Pro Pro Gln Pro
1525 1530 1535

Glu Tyr Asp Leu Glu Leu Ile Thr Ser Cys Ser Ser Asn Val Ser Val
1540 1545 1550

Ala His Asp Gly Ala Gly Lys Arg Val Tyr Tyr Leu Thr Arg Asp Pro
1555 1560 1565

Thr Thr Pro Leu Ala Arg Ala Ala Trp Glu Thr Ala Arg His Thr Pro
1570 1575 1580

Val Asn Ser Trp Leu Gly Asn Ile Ile Met Phe Ala Pro Thr Leu Trp
1585 1590 1595 1600

Ala Arg Met Ile Leu Met Thr His Phe Phe Ser Val Leu Ile Ala Arg
1605 1610 1615

Asp Gln Leu Glu Gln Ala Leu Asp Cys Glu Ile Tyr Gly Ala Cys Tyr
1620 1625 1630

Ser Ile Glu Pro Leu Asp Leu Pro Pro Ile Ile Gln Arg Leu His Gly
1635 1640 1645

Leu Ser Ala Phe Ser Leu His Ser Tyr Ser Pro Gly Glu Ile Asn Arg
1650 1655 1660 

Val Ala Ala Cys Leu Arg Lys Leu Gly Val Pro Pro Leu Arg Ala Trp 
1665 1670 1675 1680

Arg His Arg Ala Arg Ser Val Arg Ala Arg Leu Leu Ala Arg Gly Gly 
1685 1690 1695 

Arg Ala Ala Ile Cys Gly Lys Tyr Leu Phe Asn Trp Ala Val Arg Thr
1700 1705 1710

Lys Leu Lys Leu Thr Pro Ile Ala Ala Ala Gly Gln Leu Asp Leu Ser
1715 1720 1725

Gly Trp Phe Thr Ala Gly Tyr Ser Gly Gly Asp Ile Tyr His Ser Val
1730 1735 1740

Ser His Ala Arg Pro Arg Trp Ile Trp Phe Cys Leu Leu Leu Leu Ala
1745 1750 1755 1760

Ala Gly Val Gly Ile Tyr Leu Leu Pro Asn Arg
1765 1770 



<210> 10 

<211> 19798 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: 
pd.deltaNS3NS5.pj

<220>

<221> CDS

<222> (12679)..(17991)

<400> 10 
atcgatccta ccccttgcgc taaagaagta tatgtgccta ctaacgcttg tctttgtctc 60
tgtcactaaa cactggatta ttactcccag atacttattt tggactaatt taaatgattt 120
cggatcaacg ttcttaatat cgctgaatct tccacaattg atgaaagtag ctaggaagag 180
gaattggtat aaagtttttg tttttgtaaa tctcgaagta tactcaaacg aatttagtat 240
tttctcagtg atctcccaga tgctttcacc ctcacttaga agtgctttaa gcattttttt 300
actgtggcta tttcccttat ctgcttcttc cgatgattcg aactgtaatt gcaaactact 360
tacaatatca gtgatatcag attgatgttt ttgtccatag taaggaataa ttgtaaattc 420
ccaagcagga atcaatttct ttaatgaggc ttccagaatt gttgcttttt gcgtcttgta 480
tttaaactgg agtgatttat tgacaatatc gaaactcagc gaattgctta tgatagtatt 540
atagctcatg aatgtggctc tcttgattgc tgttccgtta tgtgtaatca tccaacataa 600
ataggttagt ccagcagcac ataatgctat tttctcacct gaaggtcttt caaacctttc 660
cacaaactga cgaacaagca ccttaggtgg tgttttacat aatatatcaa attgtggcat 720
gcttagcgcc gatcttgtgt gcaattgata tctagtttca actactctat ttatcttgta 780
tcttgcagta ttcaaacacg ctaactcgaa aaactaactt taattgtcct gtttgtctcg 840
cgttctttcg aaaaatgcac cggccgcgca ttatttgtac tgcgaaaata attggtactg 900
cggtatcttc atttcatatt ttaaaaatgc acctttgctg cttttcctta atttttagac 960
ggcccgcagg ttcgttttgc ggtactatct tgtgataaaa agttgttttg acatgtgatc 1020
tgcacagatt ttataatgta ataagcaaga atacattatc aaacgaacaa tactggtaaa 1080
agaaaaccaa aatggacgac attgaaacag ccaagaatct gacggtaaaa gcacgtacag 1140
cttatagcgt ctgggatgta tgtcggctgt ttattgaaat gattgctcct gatgtagata 1200
ttgatataga gagtaaacgt aagtctgatg agctactctt tccaggatat gtcataaggc 1260
ccatggaatc tctcacaacc ggtaggccgt atggtcttga ttctagcgca gaagattcca 1320
gcgtatcttc tgactccagt gctgaggtaa ttttgcctgc tgcgaagatg gttaaggaaa 1380
ggtttgattc gattggaaat ggtatgctct cttcacaaga agcaagtcag gctgccatag 1440
atttgatgct acagaataac aagctgttag acaatagaaa acaactatac 1500
ctataataat aggaagattg cccgagaaag acaagaagag agctaccgaa atgctcatga 1560
gaaaaatgga ttgtacacag ttattagtcc caccagctcc aacggaagaa gatgttatga 1620
agctcgtaag cgtcgttacc caattgctta ctttagttcc accagatcgt caagctgctt 1680
taataggtga tttattcatc ccggaatctc taaaggatat attcaatagt ttcaatgaac 1740
tggcggcaga gaatcgttta cagcaaaaaa agagtgagtt ggaaggaagg actgaagtga 1800
accatgctaa tacaaatgaa gaagttccct ccaggcgaac aagaagtaga gacacaaatg 1860
caagaggagc atataaatta caaaacacca tcactgaggg ccctaaagcg gttcccacga 1920
aaaaaaggag agtagcaacg agggtaaggg gcagaaaatc acgtaatact tctagggtat 1980
gatccaatat caaaggaaat gatagcattg aaggatgaga ctaatccaat tgaggagtgg 2040
cagcatatag aacagctaaa gggtagtgct gaaggaagca tacgataccc cgcatggaat 2100
gggataatat cacaggaggt actagactac ctttcatcct acataaatag acgcatataa 2160
gtacgcattt aagcataaac acgcactatg ccgttcttct catgtatata tatatacagg 2220
caacacgcag atataggtgc gacgtgaaca gtgagctgta tgtgcgcagc tcgcgttgca 2280
ttttcggaag cgctcgtttt cggaaacgct ttgaagttcc tattccgaag ttcctattct 2340
ctagaaagta taggaacttc agagcgcttt tgaaaaccaa aagcgctctg aagacgcact 2400
ttcaaaaaac caaaaacgca ccggactgta acgagctact aaaatattgc gaataccgct 2460
tccacaaaca ttgctcaaaa gtatctcttt gctatatatc tctgtgctat atccctatat 2520
aacctaccca tccacctttc gctccttgaa cttgcatcta aactcgacct ctacatcaac 2580
aggcttccaa tgctcttcaa attttactgt caagtagacc catacggctg taatatgctg 2640
ctcttcataa tgtaagctta tctttatcga atcgtgtgaa aaactactac cgcgataaac 2700
ctttacggtt ccctgagatt gaattagttc ctttagtata tgatacaaga cacttttgaa 2760
ctttgtacga cgaattttga ggttcgccat cctctggcta tttccaatta tcctgtcggc 2820
tattatctcc gcctcagttt gatcttccgc ttcagactgc catttttcac ataatgaatc 2880
tatttcaccc cacaatcctt catccgcctc cgcatcttgt tccgttaaac tattgacttc 2940
atgttgtaca ttgtttagtt cacgagaagg gtcctcttca ggcggtagct cctgatctcc 3000
tatatgacct ttatcctgtt ctctttccac aaacttagaa atgtattcat gaattatgga 3060
gcacctaata acattcttca aggcggagaa gtttgggcca gatgcccaat atgcttgaca 3120
tgaaaacgtg agaatgaatt tagtattatt gtgatattct gaggcaattt tattataatc 3180
tcgaagataa gagaagaatg cagtgacctt tgtattgaca aatggagatt ccatgtatct 3240
aaaaaatacg cctttaggcc ttctgatacc ctttcccctg cggtttagcg tgccttttac 3300
attaatatct aaaccctctc cgatggtggc ctttaactga ctaataaatg caaccgatat 3360
aaactgtgat aattctgggt gatttatgat tcgatcgaca attgtattgt acactagtgc 3420
aggatcaggc caatccagtt ctttttcaat taccggtgtg tcgtctgtat tcagtacatg 3480
tccaacaaat gcaaatgcta acgttttgta tttcttataa ttgtcaggaa ctggaaaagt 3540
cccccttgtc gtctcgatta cacacctact ttcatcgtac accataggtt ggaagtgctg 3600
cataatacat tgcttaatac aagcaagcag tctctcgcca ttcatatttc agttattttc 3660
cattacagct gatgtcattg tatatcagcg ctgtaaaaat ctatctgtta cagaaggttt 3720
tcgcggtttt tataaacaaa actttcgtta cgaaatcgag caatcacccc agctgcgtat 3780
ttggaaattc gggaaaaagt agagcaacgc gagttgcatt ttttacacca taatgcatga 3840
ttaacttcga gaagggatta aggctaattt cactagtatg tttcaaaaac ctcaatctgt 3900
ccattgaatg ccttataaaa cagctataga ttgcatagaa gagttagcta ctcaatgctt 3960
tttgtcaaag cttactgatg atgatgtgtc tactttcagg cgggtctgta gtaaggagaa 4020
tgacattata aagctggcac ttagaattcc acggactata gactatacta gtatactccg 4080
tctactgtac gatacacttc cgctcaggtc cttgtccttt aacgaggcct taccactctt 4140
ttgttactct attgatccag ctcagcaaag gcagtgtgat ctaagattct atcttcgcga 4200
tgtagtaaaa ctagctagac cgagaaagag actagaaatg caaaaggcac ttctacaatg 4260
gctgccatca ttattatccg atgtgacgct gcattttttt tttttttttt tttttttttt 4320
tttttttttt tttttttttt ttttttggta caaatatcat aaaaaaagag aatcttttta 4380
agcaaggatt ttcttaactt cttcggcgac agcatcaccg acttcggtgg tactgttgga 4440
accacctaaa tcaccagttc tgatacctgc atccaaaacc tttttaactg catcttcaat 4500
ggctttacct tcttcaggca agttcaatga caatttcaac atcattgcag cagacaagat 4560
agtggcgata gggttgacct tattctttgg caaatctgga gcggaaccat ggcatggttc 4620
gtacaaacca aatgcggtgt tcttgtctgg caaagaggcc aaggacgcag atggcaacaa 4680
acccaaggag cctgggataa cggaggcttc atcggagatg atatcaccaa acatgttgct 4740
ggtgattata ataccattta ggtgggttgg gttcttaact aggatcatgg cggcagaatc 4800
aatcaattga tgttgaactt tcaatgtagg gaattcgttc ttgatggttt cctccacagt 4860
ttttctccat aatcttgaag aggccaaaac attagcttta tccaaggacc aaataggcaa 4920
tggtggctca tgttgtaggg ccatgaaagc ggccattctt gtgattcttt gcacttctgg 4980
aacggtgtat tgttcactat cccaagcgac accatcacca tcgtcttcct ttctcttacc 5040
aaagtaaata cctcccacta attctctaac aacaacgaag tcagtacctt tagcaaattg 5100
tggcttgatt ggagataagt ctaaaagaga gtcggatgca aagttacatg gtcttaagtt 5160
ggcgtacaat tgaagttctt tacggatttt tagtaaacct tgttcaggtc taacactacc 5220
ggtaccccat ttaggaccac ccacagcacc taacaaaacg gcatcagcct tcttggaggc 5280
ttccagcgcc tcatctggaa gtggaacacc tgtagcatcg atagcagcac caccaattaa 5340
atgattttcg aaatcgaact tgacattgga acgaacatca gaaatagctt taagaacctt 5400
aatggcttcg gctgtgattt cttgaccaac gtggtcacct ggcaaaacga cgatcttctt 5460
aggggcagac attacaatgg dcia^cii^oii^ciii. aaaaaaaaaa aaaaaaaaaa 5520
aaaaaaaaaa atgcagcttc tttgaggaga tacagcctaa 5580
tatccgacaa actgttttac a h> ii« a ^ fata ft* tacccatcat tgaattttga 5640
acatccgaac ctgggagttt tccctgaaac agatagtata tttgaacctg tataataata 5700
tatagtctag cgctttacgg aagacaatgt atgtatttcg gttcctggag aaactattgc 5760
atctattgca taggtaatct tgcacgtcgc atccccggtt cattttctgc gtttccatct 5820
tgcacttcaa tagcatatct ttgttaacga agcatctgtg cttcattttg tagaacaaaa 5880
atgcaacgcg agagcgctaa tttttcaaac aaagaatctg agctgcattt ttacagaaca 5940
gaaatgcaac gcgaaagcgc tattttacca acgaagaatc tgtgcttcat ttttgtaaaa 6000
caaaaatgca acgcgagagc gctaattttt caaacaaaga atctgagctg catttttaca 6060
gaacagaaat gcaacgcgag agcgctattt taccaacaaa gaatctatac ttcttttttg 6120
ttctacaaaa atgcatcccg agagcgctat ttttctaaca aagcatctta gattactttt 6180
tttctccttt gtgcgctcta taatgcagtc tcttgataac tttttgcact gtaggtccgt 6240
taaggttaga agaaggctac tttggtgtct attttctctt ccataaaaaa agcctgactc 6300
cacttcccgc gtttactgat tactagcgaa gctgcgggtg cattttttca agataaaggc 6360
atccccgatt atattctata ccgatgtgga ttgcgcatac tttgtgaaca gaaagtgata 6420
gcgttgatga ttcttcattg gtcagaaaat tatgaacggt ttcttctatt ttgtctctat 6480
atactacgta taggaaatgt ttacattttc gtattgtttt cgattcactc tatgaatagt 6540
tcttactaca atttttttgt ctaaagagta atactagaga taaacataaa aaatgtagag 6600
gtcgagttta gatgcaagtt caaggagcga aaggtggatg ggtaggttat atagggatat 6660
agcacagaga tatatagcaa agagatactt ttgagcaatg tttgtggaag cggtattcgc 6720
aatattttag tagctcgtta cagtccggtg cgtttttggt tttttgaaag tgcgtcttca 6780
gagcgctttt ggttttcaaa agcgctctga agttcctata ctttctagag aataggaact 6840
tcggaatagg aacttcaaag cgtttccgaa aacgagcgct tccgaaaatg caacgcgagc 6900
tgcgcacata cagctcactg ttcacgtcgc acctatatct gcgtgttgcc tgtatatata 6960
tatacatgag aagaacggca tagtgcgtgt ttatgcttaa atgcgtactt atatgcgtct 7020
atttatgtag gatgaaaggt agtctagtac ctcctgtgat attatcccat tccatgcggg 7080
gtatcgtatg cttccttcag cactaccctt tagctgttct atatgctgcc actcctcaat 7140
tggattagtc tcatccttca atgctatcat ttcctttgat attggatcat atgcatagta 7200
ccgagaaact agtgcgaagt agtgatcagg tattgctgtt atctgatgag tatacgttgt 7260
cctggccacg gcagaagcac gcttatcgct ccaatttccc acaacattag tcaactccgt 7320
taggcccttc attgaaagaa atgaggtcat caaatgtctt ccaatgtgag attttgggcc 7380
attttttata gcaaagattg aataaggcgc atttttcttc aaagctttat tgtacgatct 7440
gactaagtta tcttttaata attggtattc ctgtttattg cttgaagaat tgccggtcct 7500
atttactcgt tttaggactg gttcagaatt cctcaaaaat tcatccaaat atacaagtgg 7560
atcgatgata agctgtcaaa catgagaatt cttgaagacg aaagggcctc gtgatacgcc 7620
tatttttata ggttaatgtc atgataataa tggtttctta gacgtcaggt ggcacttttc 7680
ggggaaatgt gcgcggaacc cctatttgtt tatttttcta aatacattca aatatgtatc 7740
cgctcatgag acaataaccc tgataaatgc ttcaataata ttgaaaaagg aagagtatga 7800
gtattcaaca tttccgtgtc gcccttattc ccttttttgc ggcattttgc cttcctgttt 7860
ttgctcaccc agaaacgctg gtgaaagtaa aagatgctga agatcagttg ggtgcacgag 7920
tgggttacat cgaactggat ctcaacagcg gtaagatcct tgagagtttt cgccccgaag 7980
aacgttttcc aatgatgagc acttttaaag ttctgctatg tggcgcggta ttatcccgtg 8040
ttgacgccgg gcaagagcaa ctcggtcgcc gcatacacta ttctcagaat gacttggttg 8100
agtactcacc agtcacagaa aagcatctta cggatggcat gacagtaaga gaattatgca 8160
gtgctgccat aaccatgagt gataacactg cggccaactt acttctgaca acgatcggag 8220
gaccgaagga gctaaccgct tttttgcaca acatggggga tcatgtaact cgccttgatc 8280
gttgggaacc ggagctgaat gaagccatac caaacgacga gcgtgacacc acgatgcctg 8340
cagcaatggc aacaacgttg cgcaaactat taactggcga actacttact ctagcttccc 8400
ggcaacaatt aatagactgg atggaggcgg ataaagttgc aggaccactt ctgcgctcgg 8460
cccttccggc tggctggttt attgctgata aatctggagc cggtgagcgt gggtctcgcg 8520
gtatcattgc agcactgggg ccagatggta agccctcccg tatcgtagtt atctacacga 8580
cggggagtca ggcaactatg gatgaacgaa atagacagat cgctgagata ggtgcctcac 8640
tgattaagca ttggtaactg tcagaccaag tttactcata tatactttag attgatttaa 8700
aacttcattt ttaatttaaa aggatctagg tgaagatcct ttttgataat ctcatgacca 8760
aaatccctta acgtgagttt tcgttccact gagcgtcaga ccccgtagaa aagatcaaag 8820
gatcttcttg agatcctttt tttctgcgcg taatctgctg cttgcaaaca aaaaaaccac 8880
cgctaccagc ggtggtttgt ttgccggatc aagagctacc aactcttttt ccgaaggtaa 8940
ctggcttcag cagagcgcag ataccaaata ctgtccttct agtgtagccg tagttaggcc 9000
accacttcaa gaactctgta gcaccgccta catacctcgc tctgctaatc ctgttaccag 9060 
tggctgctgc cagtggcgat aagtcgtgtc ttaccgggtt ggactcaaga cgatagttac 9120 
cggataaggc gcagcggtcg ggctgaacgg ggggttcgtg cacacagccc agcttggagc 9180 
gaacgaccta caccgaactg agatacctac agcgtgagct atgagaaagc gccacgcttc 9240 
ccgaagggag aaaggcggac aggtatccgg taagcggcag ggtcggaaca ggagagcgca 9300 
cgagggagct tccaggggga aacgcctggt atctttatag tcctgtcggg tttcgccacc 9360 
tctgacttga gcgtcgattt ttgtgatgct cgtcaggggg gcggagccta tggaaaaacg 9420 
ccagcaacgc ggccttttta cggttcctgg ccttttgctg gccttttgct cacatgttct 9480 
ttcctgcgtt atcccctgat tctgtggata accgtattac cgcctttgag tgagctgata 9540 
ccgctcgccg cagccgaacg accgagcgca gcgagtcagt gagcgaggaa gcggaagagc 9600 
gcctgatgcg gtattttctc cttacgcatc tgtgcggtat ttcacaccgc atatggtgca 9660 
ctctcagtac aatctgctct gatgccgcat agttaagcca gtatacactc cgctatcgct 9720 
acgtgactgg gtcatggctg cgccccgaca cccgccaaca cccgctgacg cgccctgacg 9780 
ggcttgtctg ctcccggcat ccgcttacag acaagctgtg accgtctccg ggagctgcat 9840 
gtgtcagagg ttttcaccgt catcaccgaa acgcgcgagg cagctgcggt aaagctcatc 9900 
agcgtggtcg tgaagcgatt cacagatgtc tgcctgttca tccgcgtcca gctcgttgag 9960 
tttctccaga agcgttaatg tctggcttct gataaagcgg gccatgttaa gggcggtttt 10020 
ttcctgtttg gtcactgatg cctccgtgta agggggattt ctgttcatgg gggtaatgat 10080 
accgatgaaa cgagagagga tgctcacgat acgggttact gatgatgaac atgcccggtt 10140 
actggaacgt tgtgagggta aacaactggc ggtatggatg cggcgggacc agagaaaaat 10200 
cactcagggt caatgccagc gcttcgttaa tacagatgta ggtgttccac agggtagcca 10260 
gcagcatcct gcgatgcaga tccggaacat aatggtgcag ggcgctgact tccgcgtttc 10320 
cagactttac gaaacacgga aaccgaagac cattcatgtt gttgctcagg tcgcagacgt 10380 
tttgcagcag cagtcgcttc acgttcgctc gcgtatcggt gattcattct gctaaccagt 10440 
aaggcaaccc cgccagccta gccgggtcct caacgacagg agcacgatca tgcgcacccg 10500
tggccaggac ccaacgctgc ccgagatgcg ccgcgtgcgg ctgctggaga tggcggacgc 10560 
gatggatatg ttctgccaag ggttggtttg cgcattcaca gttctccgca agaattgatt 10620 
ggctccaatt cttggagtgg tgaatccgtt agcgaggtgc cgccggcttc cattcaggtc 10680 
gaggtggccc ggctccatgc accgcgacgc aacgcgggga ggcagacaag gtatagggcg 10740
gcgcctacaa tccatgccaa cccgttccat gtgctcgccg aggcggcata aatcgccgtg 10800
acgatcagcg gtccaatgat cgaagttagg ctggtaagag ccgcgagcga tccttgaagc 10860 
tgtccctgat ggtcgtcatc tacctgcctg gacagcatgg cctgcaacgc gggcatcccg 10920 
atgccgccgg aagcgagaag aatcataatg gggaaggcca tccagcctcg cgtcgcgaac 10980 
gccagcaaga cgtagcccag cgcgtcggcc gccatgccgg cgataatggc ctgcttctcg 11040 
ccgaaacgtt tggtggcggg accagtgacg aaggcttgag cgagggcgtg caagattccg 11100
aataccgcaa gcgacaggcc gatcatcgtc gcgctccagc gaaagcggtc ctcgccgaaa 11160 
atgacccaga gcgctgccgg cacctgtcct acgagttgca tgataaagaa gacagtcata 11220 
agtgcggcga cgatagtcat gccccgcgcc caccggaagg agctgactgg gttgaaggct 11280 
ctcaagggca tcggtcgagg atccttcaat atgcgcacat acgctgttat gttcaaggtc 11340 
ccttcgttta agaacgaaag cggtcttcct tttgagggat gtttcaagtt gttcaaatct 11400 
atcaaatttg caaatcccca gtctgtatct agagcgttga atcggtgatg cgatttgtta 11460 
attaaattga tggtgtcacc attaccaggt ctagatatac caatggcaaa ctgagcacaa 11520 
caataccagt ccggatcaac tggcaccatc tctcccgtag tctcatctaa tttttcttcc 11580 
ggatgaggtt ccagatatac cgcaacacct ttattatggt ttccctgagg gaataataga 11640
atgtcccatt cgaaatcacc aattctaaac ctgggcgaat tgtatttcgg gtttgttaac 11700 
tcgttccagt caggaatgtt ccacgtgaag ctatcttcca gcaaagtctc cacttcttca 11760 
tcaaattgtg gagaatactc ccaatgctct tatctatggg acttccggga aacacagtac 11820 
cgatacttcc caattcgtct tcagagctca ttgtttgttt gaagagacta atcaaagaat 11880 
cgttttctca aaaaaattaa tatcttaact gatagtttga tcaaaggggc aaaacgtagg 11940 
ggcaaacaaa cggaaaaatc gtttctcaaa ttttctgatg ccaagaactc taaccagtct 12000 
tatctaaaaa ttgccttatg atccgtctct ccggttacag cctgtgtaac tgattaatcc 12060 
tgcctttcta atcaccattc taatgtttta attaagggat tttgtcttca ttaacggctt 12120 
tcgctcataa aaatgttatg acgttttgcc cgcaggcggg aaaccatcca cttcacgaga 12180 
ctgatctcct ctgccggaac accgggcatc tccaacttat aagttggaga aataagagaa 12240 
tttcagattg agagaatgaa aaaaaaaaac ccttagttca taggtccatt ctcttagcgc 12300 
aactacagag aacaggggca caaacaggca aaaaacgggc acaacctcaa tggagtgatg 12360 
caacctgcct ggagtaaatg atgacacaag gcaattgacc cacgcatgta tctatctcat 12420 
tttcttacac cttctattac cttctgctct ctctgatttg gaaaaagctg aaaaaaaagg 12480 
ttgaaaccag ttccctgaaa ttattcccct acttgactaa taagtatata aagacggtag 12540
gtattgattg taattctgta aatctatttc ttaaacttct taaattctac ttttatagtt 12600

agtctttttt ttagttttaa aacaccaaga acttagtttc gaataaacac acataaacaa 12660

acaagcttac aaaacaaa atg gct gca tat gca gct cag ggc tat aag gtg 12711

Met Ala Ala Tyr Ala Ala Gln Gly Tyr Lys Val
1 5 10

cta gta ctc aac ccc tct gtt gct gca aca ctg ggc ttt ggt gct tac 12759
Leu Val Leu Asn Pro Ser Val Ala Ala Thr Leu Gly Phe Gly Ala Tyr
15 20 25

atg tcc aag gct cat ggg atc gat cct aac atc agg acc ggg gtg aga 12807
Met Ser Lys Ala His Gly Ile Asp Pro Asn Ile Arg Thr Gly Val Arg
30 35 40

aca att acc act ggc agc ccc atc acg tac tcc acc tac ggc aag ttc 12855
Thr Ile Thr Thr Gly Ser Pro Ile Thr Tyr Ser Thr Tyr Gly Lys Phe
45 50 55

ctt gcc gac ggc ggg tgc tcg ggg ggc gct tat gac ata ata att tgt 12903
Leu Ala Asp Gly Gly Cys Ser Gly Gly Ala Tyr Asp Ile Ile Ile Cys
60 65 70 75 

gac gag tgc cac tcc acg gat gcc aca tcc atc ttg ggc att ggc act 12951
Asp Glu Cys His Ser Thr Asp Ala Thr Ser Ile Leu Gly Ile Gly Thr
80 85 90

gtc ctt gac caa gca gag act gcg ggg gcg aga ctg gtt gtg ctc gcc 12999
Val Leu Asp Gln Ala Glu Thr Ala Gly Ala Arg Leu Val Val Leu Ala
95 100 105

acc gcc acc cct ccg ggc tcc gtc act gtg ccc cat ccc aac atc gag 13047
Thr Ala Thr Pro Pro Gly Ser Val Thr Val Pro His Pro Asn Ile Glu
110 115 120

gag gtt gct ctg tcc acc acc gga gag atc cct ttt tac ggc aag gct 13095
Glu Val Ala Leu Ser Thr Thr Gly Glu Ile Pro Phe Tyr Gly Lys Ala
125 130 135 

atc ccc ctc gaa gta atc aag ggg ggg aga cat ctc atc ttc tgt cat 13143
Ile Pro Leu Glu Val Ile Lys Gly Gly Arg His Leu Ile Phe Cys His
140 145 150 155

tca aag aag aag tgc gac gaa ctc gcc gca aag ctg gtc gca ttg ggc 13191
Ser Lys Lys Lys Cys Asp Glu Leu Ala Ala Lys Leu Val Ala Leu Gly
160 165 170

atc aat gcc gtg gcc tac tac cgc ggt ctt gac gtg tcc gtc atc ccg 13239
Ile Asn Ala Val Ala Tyr Tyr Arg Gly Leu Asp Val Ser Val Ile Pro
175 180 185

acc agc ggc gat gtt gtc gtc gtg gca acc gat gcc ctc atg acc ggc 13287
Thr Ser Gly Asp Val Val Val Val Ala Thr Asp Ala Leu Met Thr Gly
190 195 200 

tat acc ggc gac ttc gac tcg gtg ata gac tgc aat acg tgt gtc acc 13335
Tyr Thr Gly Asp Phe Asp Ser Val Ile Asp Cys Asn Thr Cys Val Thr
205 210 215

cag aca gtc gat ttc agc ctt gac cct acc ttc acc att gag aca atc 13383
Gln Thr Val Asp Phe Ser Leu Asp Pro Thr Phe Thr Ile Glu Thr Ile
220 225 230 235

acg ctc ccc caa gat gct gtc tcc cgc act caa cgt cgg ggc agg act 13431
Thr Leu Pro Gln Asp Ala Val Ser Arg Thr Gln Arg Arg Gly Arg Thr
240 245 250

ggc agg ggg aag cca ggc atc tac aga ttt gtg gca ccg ggg gag cgc 13479
Gly Arg Gly Lys Pro Gly Ile Tyr Arg Phe Val Ala Pro Gly Glu Arg
255 260 265 

ccc tcc ggc atg ttc gac tcg tcc gtc ctc tgt gag tgc tat gac gca 13527
Pro Ser Gly Met Phe Asp Ser Ser Val Leu Cys Glu Cys Tyr Asp Ala
270 275 280

ggc tgt gct tgg tat gag ctc acg ccc gcc gag act aca gtt agg cta 13575
Gly Cys Ala Trp Tyr Glu Leu Thr Pro Ala Glu Thr Thr Val Arg Leu
285 290 295

cga gcg tac atg aac acc ccg ggg ctt ccc gtg tgc cag gac cat ctt 13623
Arg Ala Tyr Met Asn Thr Pro Gly Leu Pro Val Cys Gln Asp His Leu
300 305 310 315

gaa ttt tgg gag ggc gtc ttt aca ggc ctc act cat ata gat gcc cac 13671
Glu Phe Trp Glu Gly Val Phe Thr Gly Leu Thr His Ile Asp Ala His
320 325 330 

ttt cta tcc cag aca aag cag agt ggg gag aac ctt cct tac ctg gta 13719
Phe Leu Ser Gln Thr Lys Gln Ser Gly Glu Asn Leu Pro Tyr Leu Val
335 340 345

gcg tac caa gcc acc gtg tgc gct agg gct caa gcc cct ccc cca tcg 13767
Ala Tyr Gln Ala Thr Val Cys Ala Arg Ala Gln Ala Pro Pro Pro Ser
350 355 360

tgg gac cag atg tgg aag tgt ttg att cgc ctc aag ccc acc ctc cat 13815
Trp Asp Gln Met Trp Lys Cys Leu Ile Arg Leu Lys Pro Thr Leu His
365 370 375

ggg cca aca ccc ctg cta tac aga ctg ggc gct gtt cag aat gaa atc 13863
Gly Pro Thr Pro Leu Leu Tyr Arg Leu Gly Ala Val Gln Asn Glu Ile
380 385 390 395 

acc ctg acg cac cca gtc acc aaa tac atc atg aca tgc atg tcg gcc 13911
Thr Leu Thr His Pro Val Thr Lys Tyr Ile Met Thr Cys Met Ser Ala
400 405 410

gac ctg gag gtc gtc acg agc acc tgg gtg ctc gtt ggc ggc gtc ctg 13959
Asp Leu Glu Val Val Thr Ser Thr Trp Val Leu Val Gly Gly Val Leu
415 420 425

gct gct ttg gcc gcg tat tgc ctg tca aca ggc tgc gtg gtc ata gtg 14007
Ala Ala Leu Ala Ala Tyr Cys Leu Ser Thr Gly Cys Val Val Ile Val
430 435 440

ggc agg gtc gtc ttg tcc ggg aag ccg gca atc ata cct gac agg gaa 14055
Gly Arg Val Val Leu Ser Gly Lys Pro Ala Ile Ile Pro Asp Arg Glu
445 450 455

gtc ctc tac cga gag ttc gat gag atg gaa gag tgc tct cag cac tta 14103
Val Leu Tyr Arg Glu Phe Asp Glu Met Glu Glu Cys Ser Gln His Leu
460 465 470 475 

ccg tac atc gag caa ggg atg atg ctc gcc gag cag ttc aag cag aag 14151
Pro Tyr Ile Glu Gln Gly Met Met Leu Ala Glu Gln Phe Lys Gln Lys
480 485 490

gcc ctc ggc ctc ctg cag acc gcg tcc cgt cag gca gag gtt atc gcc 14199
Ala Leu Gly Leu Leu Gln Thr Ala Ser Arg Gln Ala Glu Val Ile Ala
495 500 505

cct gct gtc cag acc aac tgg caa aaa ctc gag acc ttc tgg gcg aag 14247
Pro Ala Val Gln Thr Asn Trp Gln Lys Leu Glu Thr Phe Trp Ala Lys
510 515 520

cat atg tgg aac ttc atc agt ggg ata caa tac ttg gcg ggc ttg tca 14295
His Met Trp Asn Phe Ile Ser Gly Ile Gln Tyr Leu Ala Gly Leu Ser
525 530 535 

acg ctg cct ggt aac ccc gcc att gct tca ttg atg gct ttt aca gct 14343
Thr Leu Pro Gly Asn Pro Ala Ile Ala Ser Leu Met Ala Phe Thr Ala
540 545 550 555

gct gtc acc agc cca cta acc act agc caa acc ctc ctc ttc aac ata 14391
Ala Val Thr Ser Pro Leu Thr Thr Ser Gln Thr Leu Leu Phe Asn Ile
560 565 570

ttg ggg ggg tgg gtg gct gcc cag ctc gcc gcc ccc ggt gcc gct act 14439
Leu Gly Gly Trp Val Ala Ala Gln Leu Ala Ala Pro Gly Ala Ala Thr
575 580 585

gcc ttt gtg ggc gct ggc tta gct ggc gcc gcc atc ggc agt gtt gga 14487
Ala Phe Val Gly Ala Gly Leu Ala Gly Ala Ala Ile Gly Ser Val Gly
590 595 600 

ctg ggg aag gtc ctc ata gac atc ctt gca ggg tat ggc gcg ggc gtg 14535
Leu Gly Lys Val Leu Ile Asp Ile Leu Ala Gly Tyr Gly Ala Gly Val
605 610 615

gcg gga gct ctt gtg gca ttc aag atc atg agc ggt gag gtc ccc tcc 14583
Ala Gly Ala Leu Val Ala Phe Lys Ile Met Ser Gly Glu Val Pro Ser
620 625 630 635

acg gag gac ctg gtc aat cta ctg ccc gcc atc ctc tcg ccc gga gcc 14631
Thr Glu Asp Leu Val Asn Leu Leu Pro Ala Ile Leu Ser Pro Gly Ala
640 645 650

ctc gta gtc ggc gtg gtc tgt gca gca ata ctg cgc cgg cac gtt ggc 14679
Leu Val Val Gly Val Val Cys Ala Ala Ile Leu Arg Arg His Val Gly
655 660 665 

ccg ggc gag ggg gca gtg cag tgg atg aac cgg ctg ata gcc ttc gcc 14727
Pro Gly Glu Gly Ala Val Gln Trp Met Asn Arg Leu Ile Ala Phe Ala
670 675 680

tcc cgg ggg aac cat gtt tcc ccc acg cac tac gtg ccg gag agc gat 14775
Ser Arg Gly Asn His Val Ser Pro Thr His Tyr Val Pro Glu Ser Asp
685 690 695

gca gct gcc cgc gtc act gcc ata ctc agc agc ctc act gta acc cag 14823
Ala Ala Ala Arg Val Thr Ala Ile Leu Ser Ser Leu Thr Val Thr Gln
700 705 710 715

ctc ctg agg cga ctg cac cag tgg ata agc tcg gag tgt acc act cca 14871
Leu Leu Arg Arg Leu His Gln Trp Ile Ser Ser Glu Cys Thr Thr Pro
720 725 730 

tgc tcc ggt tcc tgg cta agg gac atc tgg gac tgg ata tgc gag gtg 14919
Cys Ser Gly Ser Trp Leu Arg Asp Ile Trp Asp Trp Ile Cys Glu Val
735 740 745

ttg agc gac ttt aag acc tgg cta aaa gct aag ctc atg cca cag ctg 14967
Leu Ser Asp Phe Lys Thr Trp Leu Lys Ala Lys Leu Met Pro Gln Leu
750 755 760

cct ggg atc ccc ttt gtg tcc tgc cag cgc ggg tat aag ggg gtc tgg 15015
Pro Gly Ile Pro Phe Val Ser Cys Gln Arg Gly Tyr Lys Gly Val Trp
765 770 775

cga ggg gac ggc atc atg cac act cgc tgc cac tgt gga gct gag atc 15063
Arg Gly Asp Gly Ile Met His Thr Arg Cys His Cys Gly Ala Glu Ile
780 785 790 795 

act gga cat gtc aaa aac ggg acg atg agg ate gtc ggt cet agg ace 15111 
Thr Gly His Val Lys Asn Gly Thr Met Arg He Val Gly Pro Arg Thr 
800 805 810 

tgc agg aac atg tgg agt ggg ace tte cec att aat gcc tac acc acg 15159 
Cys Arg Asn Met Trp Ser Gly Thr Phe Pro He Asn Ala Tyr Thr Thr 
815 820 825 

ggc cec tgt ace cec ett cet geg eeg aac tac acg. tte gcg eta tgg 15207 
Gly Pro Cys Thr Pro Leu Pro Ala Pro Asn Tyr Thr Phe Ala Leu Trp 
830 835 840 

agg gtg tct gca gag gaa tac gtg gag ata agg cag gtg ggg gac ttc 15255
Arg Val Ser Ala Glu Glu Tyr Val Glu Ile Arg Gln Val Gly Asp Phe
845 850 855 

cac tac gtg acg ggt atg act act gac aat ctt aaa tgc ccg tgc cag 15303
His Tyr Val Thr Gly Met Thr Thr Asp Asn Leu Lys Cys Pro Cys Gln
860 865 870 875 

gtc cca tcg ccc gaa ttt ttc aca gaa ttg gac ggg gtg cgc cta cat 15351
Val Pro Ser Pro Glu Phe Phe Thr Glu Leu Asp Gly Val Arg Leu His 
880 885 890 

agg ttt gcg ccc ccc tgc aag ccc ttg ctg cgg gag gag gta tca ttc 15399
Arg Phe Ala Pro Pro Cys Lys Pro Leu Leu Arg Glu Glu Val Ser Phe 
895 900 905 

aga gta gga ctc cac gaa tac ccg gta ggg tcg caa tta cct tgc gag 15447
Arg Val Gly Leu His Glu Tyr Pro Val Gly Ser Gln Leu Pro Cys Glu
910 915 920

ccc gaa ccg gac gtg gcc gtg ttg acg tcc atg ctc act gat ccc tcc 15495
Pro Glu Pro Asp Val Ala Val Leu Thr Ser Met Leu Thr Asp Pro Ser
925 930 935

cat ata aca gca gag gcg gcc ggg cga agg ttg gcg agg gga tca ccc 15543
His Ile Thr Ala Glu Ala Ala Gly Arg Arg Leu Ala Arg Gly Ser Pro
940 945 950 955

ccc tct gtg gcc agc tcc tcg gct agc cag cta tcc gct cca tct ctc 15591
Pro Ser Val Ala Ser Ser Ser Ala Ser Gln Leu Ser Ala Pro Ser Leu
960 965 970

aag gca act tgc acc gct aac cat gac tcc cct gat gct gag ctc ata 15639
Lys Ala Thr Cys Thr Ala Asn His Asp Ser Pro Asp Ala Glu Leu Ile
975 980 985

gag gcc aac ctc cta tgg agg cag gag atg ggc ggc aac atc acc agg 15687
Glu Ala Asn Leu Leu Trp Arg Gln Glu Met Gly Gly Asn Ile Thr Arg
990 995 1000

gtt gag tca gaa aac aaa gtg gtg att ctg gac tcc ttc gat ccg ctt 15735
Val Glu Ser Glu Asn Lys Val Val Ile Leu Asp Ser Phe Asp Pro Leu
1005 1010 1015

gtg gcg gag gag gac gag cgg gag atc tcc gta ccc gca gaa atc ctg 15783
Val Ala Glu Glu Asp Glu Arg Glu Ile Ser Val Pro Ala Glu Ile Leu
1020 1025 1030 1035

cgg aag tct cgg aga ttc gcc cag gcc ctg ccc gtt tgg gcg cgg ccg 15831
Arg Lys Ser Arg Arg Phe Ala Gln Ala Leu Pro Val Trp Ala Arg Pro
1040 1045 1050

gac tat aac ccc ccg cta gtg gag acg tgg aaa aag ccc gac tac gaa 15879
Asp Tyr Asn Pro Pro Leu Val Glu Thr Trp Lys Lys Pro Asp Tyr Glu
1055 1060 1065

cca cct gtg gtc cat ggc tgc ccg ctt cca cct cca aag tcc cct cct 15927
Pro Pro Val Val His Gly Cys Pro Leu Pro Pro Pro Lys Ser Pro Pro
1070 1075 1080

gtg cct ccg cct cgg aag aag cgg acg gtg gtc ctc act gaa tca acc 15975
Val Pro Pro Pro Arg Lys Lys Arg Thr Val Val Leu Thr Glu Ser Thr
1085 1090 1095

cta tct act gcc ttg gcc gag ctc gcc acc aga agc ttt ggc agc tcc 16023
Leu Ser Thr Ala Leu Ala Glu Leu Ala Thr Arg Ser Phe Gly Ser Ser
1100 1105 1110 1115

tca act tcc ggc att acg ggc gac aat acg aca aca tcc tct gag ccc 16071
Ser Thr Ser Gly Ile Thr Gly Asp Asn Thr Thr Thr Ser Ser Glu Pro
1120 1125 1130

gcc cct tct ggc tgc ccc ccc gac tcc gac gct gag tcc tat tcc tcc 16119
Ala Pro Ser Gly Cys Pro Pro Asp Ser Asp Ala Glu Ser Tyr Ser Ser
1135 1140 1145

atg ccc ccc ctg gag ggg gag cct ggg gat ccg gat ctt agc gac ggg 16167
Met Pro Pro Leu Glu Gly Glu Pro Gly Asp Pro Asp Leu Ser Asp Gly
1150 1155 1160

tca tgg tca acg gtc agt agt gag gcc aac gcg gag gat gtc gtg tgc 16215
Ser Trp Ser Thr Val Ser Ser Glu Ala Asn Ala Glu Asp Val Val Cys
1165 1170 1175

tgc tca atg tct tac tct tgg aca ggc gca ctc gtc acc ccg tgc gcc 16263
Cys Ser Met Ser Tyr Ser Trp Thr Gly Ala Leu Val Thr Pro Cys Ala
1180 1185 1190 1195

gcg gaa gaa cag aaa ctg ccc atc aat gca cta agc aac tcg ttg cta 16311
Ala Glu Glu Gln Lys Leu Pro Ile Asn Ala Leu Ser Asn Ser Leu Leu
1200 1205 1210

cgt cac cac aat ttg gtg tat tcc acc acc tca cgc agt gct tgc caa 16359
Arg His His Asn Leu Val Tyr Ser Thr Thr Ser Arg Ser Ala Cys Gln
1215 1220 1225

agg cag aag aaa gtc aca ttt gac aga ctg caa gtt ctg gac agc cat 16407
Arg Gln Lys Lys Val Thr Phe Asp Arg Leu Gln Val Leu Asp Ser His
1230 1235 1240

tac cag gac gta ctc aag gag gtt aaa gca gcg gcg tca aaa gtg aag 16455
Tyr Gln Asp Val Leu Lys Glu Val Lys Ala Ala Ala Ser Lys Val Lys
1245 1250 1255

gct aac ttg cta tcc gta gag gaa gct tgc agc ctg acg ccc cca cac 16503
Ala Asn Leu Leu Ser Val Glu Glu Ala Cys Ser Leu Thr Pro Pro His
1260 1265 1270 1275

tca gcc aaa tcc aag ttt ggt tat ggg gca aaa gac gtc cgt tgc cat 16551
Ser Ala Lys Ser Lys Phe Gly Tyr Gly Ala Lys Asp Val Arg Cys His
1280 1285 1290

gcc aga aag gcc gta acc cac atc aac tcc gtg tgg aaa gac ctt ctg 16599
Ala Arg Lys Ala Val Thr His Ile Asn Ser Val Trp Lys Asp Leu Leu
1295 1300 1305

gaa gac aat gta aca cca ata gac act acc atc atg gct aag aac gag 16647
Glu Asp Asn Val Thr Pro Ile Asp Thr Thr Ile Met Ala Lys Asn Glu
1310 1315 1320

gtt ttc tgc gtt cag cct gag aag ggg ggt cgt aag cca gct cgt ctc 16695
Val Phe Cys Val Gln Pro Glu Lys Gly Gly Arg Lys Pro Ala Arg Leu
1325 1330 1335

atc gtg ttc ccc gat ctg ggc gtg cgc gtg tgc gaa aag atg gct ttg 16743
Ile Val Phe Pro Asp Leu Gly Val Arg Val Cys Glu Lys Met Ala Leu
1340 1345 1350 1355

tac gac gtg gtt aca aag ctc ccc ttg gcc gtg atg gga agc tcc tac 16791
Tyr Asp Val Val Thr Lys Leu Pro Leu Ala Val Met Gly Ser Ser Tyr
1360 1365 1370

gga ttc caa tac tca cca gga cag cgg gtt gaa ttc ctc gtg caa gcg 16839
Gly Phe Gln Tyr Ser Pro Gly Gln Arg Val Glu Phe Leu Val Gln Ala
1375 1380 1385

tgg aag tcc aag aaa acc cca atg ggg ttc tcg tat gat acc cgc tgc 16887
Trp Lys Ser Lys Lys Thr Pro Met Gly Phe Ser Tyr Asp Thr Arg Cys
1390 1395 1400

ttt gac tcc aca gtc act gag agc gac atc cgt acg gag gag gca atc 16935
Phe Asp Ser Thr Val Thr Glu Ser Asp Ile Arg Thr Glu Glu Ala Ile
1405 1410 1415

tac caa tgt tgt gac ctc gac ccc caa gcc cgc gtg gcc atc aag tcc 16983
Tyr Gln Cys Cys Asp Leu Asp Pro Gln Ala Arg Val Ala Ile Lys Ser
1420 1425 1430 1435

ctc acc gag agg ctt tat gtt ggg ggc cct ctt acc aat tca agg ggg 17031
Leu Thr Glu Arg Leu Tyr Val Gly Gly Pro Leu Thr Asn Ser Arg Gly
1440 1445 1450

gag aac tgc ggc tat cgc agg tgc cgc gcg agc ggc gta ctg aca act 17079
Glu Asn Cys Gly Tyr Arg Arg Cys Arg Ala Ser Gly Val Leu Thr Thr
1455 1460 1465

agc tgt ggt aac acc ctc act tgc tac atc aag gcc cgg gca gcc tgt 17127
Ser Cys Gly Asn Thr Leu Thr Cys Tyr Ile Lys Ala Arg Ala Ala Cys
1470 1475 1480

cga gcc gca ggg ctc cag gac tgc acc atg ctc gtg tgt ggc gac gac 17175
Arg Ala Ala Gly Leu Gln Asp Cys Thr Met Leu Val Cys Gly Asp Asp
1485 1490 1495

tta gtc gtt atc tgt gaa agc gcg ggg gtc cag gag gac gcg gcg agc 17223
Leu Val Val Ile Cys Glu Ser Ala Gly Val Gln Glu Asp Ala Ala Ser
1500 1505 1510 1515

ctg aga gcc ttc acg gag gct atg acc agg tac tcc gcc ccc cct ggg 17271
Leu Arg Ala Phe Thr Glu Ala Met Thr Arg Tyr Ser Ala Pro Pro Gly
1520 1525 1530

gac ccc cca caa cca gaa tac gac ttg gag ctc ata aca tca tgc tcc 17319
Asp Pro Pro Gln Pro Glu Tyr Asp Leu Glu Leu Ile Thr Ser Cys Ser
1535 1540 1545

tcc aac gtg tca gtc gcc cac gac ggc gct gga aag agg gtc tac tac 17367
Ser Asn Val Ser Val Ala His Asp Gly Ala Gly Lys Arg Val Tyr Tyr
1550 1555 1560

ctc acc cgt gac cct aca acc ccc ctc gcg aga gct gcg tgg gag aca 17415
Leu Thr Arg Asp Pro Thr Thr Pro Leu Ala Arg Ala Ala Trp Glu Thr
1565 1570 1575

gca aga cac act cca gtc aat tcc tgg cta ggc aac ata atc atg ttt 17463
Ala Arg His Thr Pro Val Asn Ser Trp Leu Gly Asn Ile Ile Met Phe
1580 1585 1590 1595

gcc ccc aca ctg tgg gcg agg atg ata ctg atg acc cat ttc ttt agc 17511
Ala Pro Thr Leu Trp Ala Arg Met Ile Leu Met Thr His Phe Phe Ser
1600 1605 1610

gtc ctt ata gcc agg gac cag ctt gaa cag gcc ctc gat tgc gag atc 17559
Val Leu Ile Ala Arg Asp Gln Leu Glu Gln Ala Leu Asp Cys Glu Ile
1615 1620 1625

tac ggg gcc tgc tac tcc ata gaa cca ctg gat cta cct cca atc att 17607
Tyr Gly Ala Cys Tyr Ser Ile Glu Pro Leu Asp Leu Pro Pro Ile Ile
1630 1635 1640

caa aga ctc cat ggc ctc agc gca ttt tca ctc cac agt tac tct cca 17655
Gln Arg Leu His Gly Leu Ser Ala Phe Ser Leu His Ser Tyr Ser Pro
1645 1650 1655

ggt gaa atc aat agg gtg gcc gca tgc ctc aga aaa ctt ggg gta ccg 17703
Gly Glu Ile Asn Arg Val Ala Ala Cys Leu Arg Lys Leu Gly Val Pro
1660 1665 1670 1675

ccc ttg cga gct tgg aga cac cgg gcc cgg agc gtc cgc gct agg ctt 17751
Pro Leu Arg Ala Trp Arg His Arg Ala Arg Ser Val Arg Ala Arg Leu
1680 1685 1690

ctg gcc aga gga ggc agg gct gcc ata tgt ggc aag tac ctc ttc aac 17799
Leu Ala Arg Gly Gly Arg Ala Ala Ile Cys Gly Lys Tyr Leu Phe Asn
1695 1700 1705

tgg gca gta aga aca aag ctc aaa ctc act cca ata gcg gcc gct ggc 17847
Trp Ala Val Arg Thr Lys Leu Lys Leu Thr Pro Ile Ala Ala Ala Gly
1710 1715 1720

cag ctg gac ttg tcc ggc tgg ttc acg gct ggc tac agc ggg gga gac 17895
Gln Leu Asp Leu Ser Gly Trp Phe Thr Ala Gly Tyr Ser Gly Gly Asp
1725 1730 1735

att tat cac agc gtg tct cat gcc cgg ccc cgc tgg atc tgg ttt tgc 17943
Ile Tyr His Ser Val Ser His Ala Arg Pro Arg Trp Ile Trp Phe Cys
1740 1745 1750 1755

cta ctc ctg ctt gct gca ggg gta ggc atc tac ctc ctc ccc aac cga 17991
Leu Leu Leu Leu Ala Ala Gly Val Gly Ile Tyr Leu Leu Pro Asn Arg
1760 1765 1770

tgaatagtcg actttgttcc cactgtactt ttagctcgta caaaatacaa tatacttttc 18051
atttctccgt aaacaacatg ttttcccatg taatatcctt ttctattttt cgttccgtta 18111
ccaactttac acatacttta tatagctatt cacttctata cactaaaaaa ctaagacaat 18171
tttaattttg ctgcctgcca tatttcaatt tgttataaat tcctataatt tatcctatta 18231
gtagctaaaa aaagatgaat gtgaatcgaa tcctaagaga attggatctg atccacagga 18291
cgggtgtggt cgccatgatc gcgtagtcga tagtggctcc aagtagcgaa gcgagcagga 18351
ctgggcggcg gccaaagcgg tcggacagtg ctccgagaac gggtgcgcat agaaattgca 18411
tcaacgcata tagcgctagc agcacgccat agtgactggc gatgctgtcg gaatggacga 18471
tatcccgcaa gaggcccggc agtaccggca taaccaagcc tatgcctaca gcatccaggg 18531
tgacggtgcc gaggatgacg atgagcgcat tgttagattt catacacggt gcctgactgc 18591
gttagcaatt taactgtgat aaactaccgc attaaagctt tttctttcca attttttttt 18651
tttcgtcatt ataaaaatca ttacgaccga gattcccggg taataactga tataattaaa 18711
ttgaagctct aatttgtgag tttagtatac atgcatttac ttataataca gttttttagt 18771
tttgctggcc gcatcttctc aaatatgctt cccagcctgc ttttctgtaa cgttcaccct 18831
ctaccttagc atcccttccc tttgcaaata gtcctcttcc aacaataata atgtcagatc 18891
ctgtagagac cacatcatcc acggttctat actgttgacc caatgcgtct cccttgtcat 18951
ctaaacccac accgggtgtc ataatcaacc aatcgtaacc ttcatctctt ccacccatgt 19011
ctctttgagc aataaagccg ataacaaaat ctttgtcgct cttcgcaatg tcaacagtac 19071
ccttagtata ttctccagta gatagggagc ccttgcatga caattctgct aacatcaaaa 19131
ggcctctagg ttcctttgtt acttcttctg ccgcctgctt caaaccgcta acaatacctg 19191
ggcccaccac accgtgtgca ttcgtaatgt ctgcccattc tgctattctg tatacacccg 19251
cagagtactg caatttgact gtattaccaa tgtcagcaaa ttttctgtct tcgaagagta 19311
aaaaattgta cttggcggat aatgccttta gcggcttaac tgtgccctcc atggaaaaat 19371
cagtcaagat atccacatgt gtttttagta aacaaatttt gggacctaat gcttcaacta 19431
actccagtaa ttccttggtg gtacgaacat ccaatgaagc acacaagttt gtttgctttt 19491
cgtgcatgat attaaatagc ttggcagcaa caggactagg atgagtagca gcacgttcct 19551
tatatgtagc tttcgacatg atttatcttc gtttcctgca ggtttttgtt ctgtgcagtt 19611
gggttaagaa tactgggcaa tttcatgttt cttcaacact acatatgcgt atatatacca 19671
atctaagtct gtgctccttc cttcgttctt ccttctgttc ggagattacc gaatcaaaaa 19731
aatttcaagg aaaccgaaat caaaaaaaag aataaaaaaa aaatgatgaa ttgaaaagct 19791
tatcgat 19798



<210> 11 
<211> 1771 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence:
pd.deltaNS3NS5.pj 

<400> 11 

Met Ala Ala Tyr Ala Ala Gln Gly Tyr Lys Val Leu Val Leu Asn Pro
1 5 10 15

Ser Val Ala Ala Thr Leu Gly Phe Gly Ala Tyr Met Ser Lys Ala His
20 25 30

Gly Ile Asp Pro Asn Ile Arg Thr Gly Val Arg Thr Ile Thr Thr Gly
35 40 45

Ser Pro Ile Thr Tyr Ser Thr Tyr Gly Lys Phe Leu Ala Asp Gly Gly
50 55 60

Cys Ser Gly Gly Ala Tyr Asp Ile Ile Ile Cys Asp Glu Cys His Ser
65 70 75 80

Thr Asp Ala Thr Ser Ile Leu Gly Ile Gly Thr Val Leu Asp Gln Ala
85 90 95

Glu Thr Ala Gly Ala Arg Leu Val Val Leu Ala Thr Ala Thr Pro Pro
100 105 110

Gly Ser Val Thr Val Pro His Pro Asn Ile Glu Glu Val Ala Leu Ser
115 120 125

Thr Thr Gly Glu Ile Pro Phe Tyr Gly Lys Ala Ile Pro Leu Glu Val
130 135 140

Ile Lys Gly Gly Arg His Leu Ile Phe Cys His Ser Lys Lys Lys Cys
145 150 155 160

Asp Glu Leu Ala Ala Lys Leu Val Ala Leu Gly Ile Asn Ala Val Ala
165 170 175

Tyr Tyr Arg Gly Leu Asp Val Ser Val Ile Pro Thr Ser Gly Asp Val
180 185 190

Val Val Val Ala Thr Asp Ala Leu Met Thr Gly Tyr Thr Gly Asp Phe
195 200 205

Asp Ser Val Ile Asp Cys Asn Thr Cys Val Thr Gln Thr Val Asp Phe
210 215 220

Ser Leu Asp Pro Thr Phe Thr Ile Glu Thr Ile Thr Leu Pro Gln Asp
225 230 235 240

Ala Val Ser Arg Thr Gln Arg Arg Gly Arg Thr Gly Arg Gly Lys Pro
245 250 255

Gly Ile Tyr Arg Phe Val Ala Pro Gly Glu Arg Pro Ser Gly Met Phe
260 265 270

Asp Ser Ser Val Leu Cys Glu Cys Tyr Asp Ala Gly Cys Ala Trp Tyr
275 280 285

Glu Leu Thr Pro Ala Glu Thr Thr Val Arg Leu Arg Ala Tyr Met Asn
290 295 300

Thr Pro Gly Leu Pro Val Cys Gln Asp His Leu Glu Phe Trp Glu Gly
305 310 315 320

Val Phe Thr Gly Leu Thr His Ile Asp Ala His Phe Leu Ser Gln Thr
325 330 335

Lys Gln Ser Gly Glu Asn Leu Pro Tyr Leu Val Ala Tyr Gln Ala Thr
340 345 350

Val Cys Ala Arg Ala Gln Ala Pro Pro Pro Ser Trp Asp Gln Met Trp
355 360 365

Lys Cys Leu Ile Arg Leu Lys Pro Thr Leu His Gly Pro Thr Pro Leu
370 375 380

Leu Tyr Arg Leu Gly Ala Val Gln Asn Glu Ile Thr Leu Thr His Pro
385 390 395 400

Val Thr Lys Tyr Ile Met Thr Cys Met Ser Ala Asp Leu Glu Val Val
405 410 415

Thr Ser Thr Trp Val Leu Val Gly Gly Val Leu Ala Ala Leu Ala Ala
420 425 430

Tyr Cys Leu Ser Thr Gly Cys Val Val Ile Val Gly Arg Val Val Leu
435 440 445

Ser Gly Lys Pro Ala Ile Ile Pro Asp Arg Glu Val Leu Tyr Arg Glu
450 455 460

Phe Asp Glu Met Glu Glu Cys Ser Gln His Leu Pro Tyr Ile Glu Gln
465 470 475 480

Gly Met Met Leu Ala Glu Gln Phe Lys Gln Lys Ala Leu Gly Leu Leu
485 490 495

Gln Thr Ala Ser Arg Gln Ala Glu Val Ile Ala Pro Ala Val Gln Thr
500 505 510

Asn Trp Gln Lys Leu Glu Thr Phe Trp Ala Lys His Met Trp Asn Phe
515 520 525

Ile Ser Gly Ile Gln Tyr Leu Ala Gly Leu Ser Thr Leu Pro Gly Asn
530 535 540

Pro Ala Ile Ala Ser Leu Met Ala Phe Thr Ala Ala Val Thr Ser Pro
545 550 555 560

Leu Thr Thr Ser Gln Thr Leu Leu Phe Asn Ile Leu Gly Gly Trp Val
565 570 575

Ala Ala Gln Leu Ala Ala Pro Gly Ala Ala Thr Ala Phe Val Gly Ala
580 585 590

Gly Leu Ala Gly Ala Ala Ile Gly Ser Val Gly Leu Gly Lys Val Leu
595 600 605

Ile Asp Ile Leu Ala Gly Tyr Gly Ala Gly Val Ala Gly Ala Leu Val
610 615 620

Ala Phe Lys Ile Met Ser Gly Glu Val Pro Ser Thr Glu Asp Leu Val
625 630 635 640

Asn Leu Leu Pro Ala Ile Leu Ser Pro Gly Ala Leu Val Val Gly Val
645 650 655

Val Cys Ala Ala Ile Leu Arg Arg His Val Gly Pro Gly Glu Gly Ala
660 665 670

Val Gln Trp Met Asn Arg Leu Ile Ala Phe Ala Ser Arg Gly Asn His
675 680 685

Val Ser Pro Thr His Tyr Val Pro Glu Ser Asp Ala Ala Ala Arg Val
690 695 700

Thr Ala Ile Leu Ser Ser Leu Thr Val Thr Gln Leu Leu Arg Arg Leu
705 710 715 720

His Gln Trp Ile Ser Ser Glu Cys Thr Thr Pro Cys Ser Gly Ser Trp
725 730 735

Leu Arg Asp Ile Trp Asp Trp Ile Cys Glu Val Leu Ser Asp Phe Lys
740 745 750

Thr Trp Leu Lys Ala Lys Leu Met Pro Gln Leu Pro Gly Ile Pro Phe
755 760 765

Val Ser Cys Gln Arg Gly Tyr Lys Gly Val Trp Arg Gly Asp Gly Ile
770 775 780

Met His Thr Arg Cys His Cys Gly Ala Glu Ile Thr Gly His Val Lys
785 790 795 800

Asn Gly Thr Met Arg Ile Val Gly Pro Arg Thr Cys Arg Asn Met Trp
805 810 815

Ser Gly Thr Phe Pro Ile Asn Ala Tyr Thr Thr Gly Pro Cys Thr Pro
820 825 830

Leu Pro Ala Pro Asn Tyr Thr Phe Ala Leu Trp Arg Val Ser Ala Glu
835 840 845

Glu Tyr Val Glu Ile Arg Gln Val Gly Asp Phe His Tyr Val Thr Gly
850 855 860

Met Thr Thr Asp Asn Leu Lys Cys Pro Cys Gln Val Pro Ser Pro Glu
865 870 875 880

Phe Phe Thr Glu Leu Asp Gly Val Arg Leu His Arg Phe Ala Pro Pro
885 890 895

Cys Lys Pro Leu Leu Arg Glu Glu Val Ser Phe Arg Val Gly Leu His
900 905 910

Glu Tyr Pro Val Gly Ser Gln Leu Pro Cys Glu Pro Glu Pro Asp Val
915 920 925

Ala Val Leu Thr Ser Met Leu Thr Asp Pro Ser His Ile Thr Ala Glu
930 935 940

Ala Ala Gly Arg Arg Leu Ala Arg Gly Ser Pro Pro Ser Val Ala Ser
945 950 955 960

Ser Ser Ala Ser Gln Leu Ser Ala Pro Ser Leu Lys Ala Thr Cys Thr
965 970 975

Ala Asn His Asp Ser Pro Asp Ala Glu Leu Ile Glu Ala Asn Leu Leu
980 985 990

Trp Arg Gln Glu Met Gly Gly Asn Ile Thr Arg Val Glu Ser Glu Asn
995 1000 1005

Lys Val Val Ile Leu Asp Ser Phe Asp Pro Leu Val Ala Glu Glu Asp
1010 1015 1020

Glu Arg Glu Ile Ser Val Pro Ala Glu Ile Leu Arg Lys Ser Arg Arg
1025 1030 1035 1040

Phe Ala Gln Ala Leu Pro Val Trp Ala Arg Pro Asp Tyr Asn Pro Pro
1045 1050 1055

Leu Val Glu Thr Trp Lys Lys Pro Asp Tyr Glu Pro Pro Val Val His
1060 1065 1070

Gly Cys Pro Leu Pro Pro Pro Lys Ser Pro Pro Val Pro Pro Pro Arg
1075 1080 1085

Lys Lys Arg Thr Val Val Leu Thr Glu Ser Thr Leu Ser Thr Ala Leu
1090 1095 1100

Ala Glu Leu Ala Thr Arg Ser Phe Gly Ser Ser Ser Thr Ser Gly Ile
1105 1110 1115 1120

Thr Gly Asp Asn Thr Thr Thr Ser Ser Glu Pro Ala Pro Ser Gly Cys
1125 1130 1135

Pro Pro Asp Ser Asp Ala Glu Ser Tyr Ser Ser Met Pro Pro Leu Glu
1140 1145 1150

Gly Glu Pro Gly Asp Pro Asp Leu Ser Asp Gly Ser Trp Ser Thr Val
1155 1160 1165

Ser Ser Glu Ala Asn Ala Glu Asp Val Val Cys Cys Ser Met Ser Tyr
1170 1175 1180

Ser Trp Thr Gly Ala Leu Val Thr Pro Cys Ala Ala Glu Glu Gln Lys
1185 1190 1195 1200

Leu Pro Ile Asn Ala Leu Ser Asn Ser Leu Leu Arg His His Asn Leu
1205 1210 1215

Val Tyr Ser Thr Thr Ser Arg Ser Ala Cys Gln Arg Gln Lys Lys Val
1220 1225 1230

Thr Phe Asp Arg Leu Gln Val Leu Asp Ser His Tyr Gln Asp Val Leu
1235 1240 1245

Lys Glu Val Lys Ala Ala Ala Ser Lys Val Lys Ala Asn Leu Leu Ser
1250 1255 1260

Val Glu Glu Ala Cys Ser Leu Thr Pro Pro His Ser Ala Lys Ser Lys
1265 1270 1275 1280

Phe Gly Tyr Gly Ala Lys Asp Val Arg Cys His Ala Arg Lys Ala Val
1285 1290 1295

Thr His Ile Asn Ser Val Trp Lys Asp Leu Leu Glu Asp Asn Val Thr
1300 1305 1310

Pro Ile Asp Thr Thr Ile Met Ala Lys Asn Glu Val Phe Cys Val Gln
1315 1320 1325

Pro Glu Lys Gly Gly Arg Lys Pro Ala Arg Leu Ile Val Phe Pro Asp
1330 1335 1340

Leu Gly Val Arg Val Cys Glu Lys Met Ala Leu Tyr Asp Val Val Thr
1345 1350 1355 1360

Lys Leu Pro Leu Ala Val Met Gly Ser Ser Tyr Gly Phe Gln Tyr Ser
1365 1370 1375

Pro Gly Gln Arg Val Glu Phe Leu Val Gln Ala Trp Lys Ser Lys Lys
1380 1385 1390

Thr Pro Met Gly Phe Ser Tyr Asp Thr Arg Cys Phe Asp Ser Thr Val
1395 1400 1405

Thr Glu Ser Asp Ile Arg Thr Glu Glu Ala Ile Tyr Gln Cys Cys Asp
1410 1415 1420

Leu Asp Pro Gln Ala Arg Val Ala Ile Lys Ser Leu Thr Glu Arg Leu
1425 1430 1435 1440

Tyr Val Gly Gly Pro Leu Thr Asn Ser Arg Gly Glu Asn Cys Gly Tyr
1445 1450 1455

Arg Arg Cys Arg Ala Ser Gly Val Leu Thr Thr Ser Cys Gly Asn Thr
1460 1465 1470

Leu Thr Cys Tyr Ile Lys Ala Arg Ala Ala Cys Arg Ala Ala Gly Leu
1475 1480 1485

Gln Asp Cys Thr Met Leu Val Cys Gly Asp Asp Leu Val Val Ile Cys
1490 1495 1500

Glu Ser Ala Gly Val Gln Glu Asp Ala Ala Ser Leu Arg Ala Phe Thr
1505 1510 1515 1520

Glu Ala Met Thr Arg Tyr Ser Ala Pro Pro Gly Asp Pro Pro Gln Pro
1525 1530 1535

Glu Tyr Asp Leu Glu Leu Ile Thr Ser Cys Ser Ser Asn Val Ser Val
1540 1545 1550

Ala His Asp Gly Ala Gly Lys Arg Val Tyr Tyr Leu Thr Arg Asp Pro
1555 1560 1565

Thr Thr Pro Leu Ala Arg Ala Ala Trp Glu Thr Ala Arg His Thr Pro
1570 1575 1580

Val Asn Ser Trp Leu Gly Asn Ile Ile Met Phe Ala Pro Thr Leu Trp
1585 1590 1595 1600

Ala Arg Met Ile Leu Met Thr His Phe Phe Ser Val Leu Ile Ala Arg
1605 1610 1615

Asp Gln Leu Glu Gln Ala Leu Asp Cys Glu Ile Tyr Gly Ala Cys Tyr
1620 1625 1630

Ser Ile Glu Pro Leu Asp Leu Pro Pro Ile Ile Gln Arg Leu His Gly
1635 1640 1645

Leu Ser Ala Phe Ser Leu His Ser Tyr Ser Pro Gly Glu Ile Asn Arg
1650 1655 1660

Val Ala Ala Cys Leu Arg Lys Leu Gly Val Pro Pro Leu Arg Ala Trp
1665 1670 1675 1680

Arg His Arg Ala Arg Ser Val Arg Ala Arg Leu Leu Ala Arg Gly Gly
1685 1690 1695

Arg Ala Ala Ile Cys Gly Lys Tyr Leu Phe Asn Trp Ala Val Arg Thr
1700 1705 1710

Lys Leu Lys Leu Thr Pro Ile Ala Ala Ala Gly Gln Leu Asp Leu Ser
1715 1720 1725

Gly Trp Phe Thr Ala Gly Tyr Ser Gly Gly Asp Ile Tyr His Ser Val
1730 1735 1740

Ser His Ala Arg Pro Arg Trp Ile Trp Phe Cys Leu Leu Leu Leu Ala
1745 1750 1755 1760

Ala Gly Val Gly Ile Tyr Leu Leu Pro Asn Arg
1765 1770



<210> 12 
<211> 20220 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: 
pd.delta.NS3NS5.pj.core121

<220> 
<221> CDS 

<222> (12679)..(18354)



<400> 12 
atcgatccta ccccttgcgc taaagaagta tatgtgccta ctaacgcttg tctttgtctc 60
tgtcactaaa cactggatta ttactcccag atacttattt tggactaatt taaatgattt 120
cggatcaacg ttcttaatat cgctgaatct tccacaattg atgaaagtag ctaggaagag 180
gaattggtat aaagtttttg tttttgtaaa tctcgaagta tactcaaacg aatttagtat 240
tttctcagtg atctcccaga tgctttcacc ctcacttaga agtgctttaa gcattttttt 300
actgtggcta tttcccttat ctgcttcttc cgatgattcg aactgtaatt gcaaactact 360
tacaatatca gtgatatcag attgatgttt ttgtccatag taaggaataa ttgtaaattc 420
ccaagcagga atcaatttct ttaatgaggc ttccagaatt gttgcttttt gcgtcttgta 480
tttaaactgg agtgatttat tgacaatatc gaaactcagc gaattgctta tgatagtatt 540
atagctcatg aatgtggctc tcttgattgc tgttccgtta tgtgtaatca tccaacataa 600
ataggttagt tcagcagcac ataatgctat tttctcacct gaaggtcttt caaacctttc 660
cacaaactga cgaacaagca ccttaggtgg tgttttacat aatatatcaa attgtggcat 720
gcttagcgcc gatcttgtgt gcaattgata tctagtttca actactctat ttatcttgta 780
tcttgcagta ttcaaacacg ctaactcgaa aaactaactt taattgtcct gtttgtctcg 840
cgttctttcg aaaaatgcac cggccgcgca ttatttgtac tgcgaaaata attggtactg 900
cggtatcttc atttcatatt ttaaaaatgc acctttgctg cttttcctta atttttagac 960
ggcccgcagg ttcgttttgc ggtactatct tgtgataaaa agttgttttg acatgtgatc 1020
tgcacagatt ttataatgta ataagcaaga atacattatc aaacgaacaa tactggtaaa 1080
agaaaaccaa aatggacgac attgaaacag ccaagaatct gacggtaaaa gcacgtacag 1140
cttatagcgt ctgggatgta tgtcggctgt ttattgaaat gattgctcct gatgtagata 1200
ttgatataga gagtaaacgt aagtctgatg agctactctt tccaggatat gtcataaggc 1260
ccatggaatc tctcacaacc ggtaggccgt atggtcttga ttctagcgca gaagattcca 1320
gcgtatcttc tgactccagt gctgaggtaa ttttgcctgc tgcgaagatg gttaaggaaa 1380
ggtttgattc gattggaaat ggtatgctct cttcacaaga agcaagtcag gctgccatag 1440
atttgatgct acagaataac aagctgttag acaatagaaa gcaactatac aaatctattg 1500
ctataataat aggaagattg cccgagaaag acaagaagag agctaccgaa atgctcatga 1560
gaaaaatgga ttgtacacag ttattagtcc caccagctcc aacggaagaa gatgttatga 1620
agctcgtaag cgtcgttacc caattgctta ctttagttcc accagatcgt caagctgctt 1680
taataggtga tttattcatc ccggaatctc taaaggatat attcaatagt ttcaatgaac 1740
tggcggcaga gaatcgttta cagcaaaaaa agagtgagtt ggaaggaagg actgaagtga 1800
accatgctaa tacaaatgaa gaagttccct ccaggcgaac aagaagtaga gacacaaatg 1860
caagaggagc atataaatta caaaacacca tcactgaggg ccctaaagcg gttcccacga 1920
aaaaaaggag agtagcaacg agggtaaggg gcagaaaatc acgtaatact tctagggtat 1980
gatccaatat caaaggaaat gatagcattg aaggatgaga ctaatccaat tgaggagtgg 2040
cagcatatag aacagctaaa gggtagtgct gaaggaagca tacgataccc cgcatggaat 2100
gggataatat cacaggaggt actagactac ctttcatcct acataaatag acgcatataa 2160
gtacgcattt aagcataaac acgcactatg ccgttcttct catgtatata tatatacagg 2220
caacacgcag atataggtgc gacgtgaaca gtgagctgta tgtgcgcagc tcgcgttgca 2280
ttttcggaag cgctcgtttt cggaaacgct ttgaagttcc tattccgaag ttcctattct 2340
ctagaaagta taggaacttc agagcgcttt tgaaaaccaa aagcgctctg aagacgcact 2400
ttcaaaaaac caaaaacgca ccggactgta acgagctact aaaatattgc gaataccgct 2460
tccacaaaca ttgctcaaaa gtatctcttt gctatatatc tctgtgctat atccctatat 2520
aacctaccca tccacctttc gctccttgaa cttgcatcta aactcgacct ctacatcaac 2580
aggcttccaa tgctcttcaa attttactgt caagtagacc catacggctg taatatgctg 2640
ctcttcataa tgtaagctta tctttatcga atcgtgtgaa aaactactac cgcgataaac 2700
ctttacggtt ccctgagatt gaattagttc ctttagtata tgatacaaga cacttttgaa 2760
ctttgtacga cgaattttga ggttcgccat cctctggcta tttccaatta tcctgtcggc 2820
tattatctcc gcctcagttt gatcttccgc ttcagactgc catttttcac ataatgaatc 2880
tatttcaccc cacaatcctt catccgcctc cgcatcttgt tccgttaaac tattgacttc 2940
atgttgtaca ttgtttagtt cacgagaagg gtcctcttca ggcggtagct cctgatctcc 3000
tatatgacct ttatcctgtt ctctttccac aaacttagaa atgtattcat gaattatgga 3060
gcacctaata acattcttca aggcggagaa gtttgggcca gatgcccaat atgcttgaca 3120
tgaaaacgtg agaatgaatt tagtattatt gtgatattct gaggcaattt tattataatc 3180
tcgaagataa gagaagaatg cagtgacctt tgtattgaca aatggagatt ccatgtatct 3240
aaaaaatacg cctttaggcc ttctgatacc ctttcccctg cggtttagcg tgccttttac 3300
attaatatct aaaccctctc cgcLtggtggc ctttaactga ctaataaatg caaccgatat 3360
aaactgtgat aattctgggt gatttatgat tcgatcgaca attgtattgt acactagtgc 3420
aggatcaggc caatccagtt ctttttcaat taccggtgtg tcgtctgtat tcagtacatg 3480
tccaacaaat gcaaatgcta acgttttgta tttcttataa ttgtcaggaa ctggaaaagt 3540
cccccttgtc gtctcgatta cacacctact ttcatcgtac accataggtt ggaagtgctg 3600
cataatacat tgcttaatac aagcaagcag tctctcgcca ttcatatttc agttattttc 3660
cattacagct gatgtcattg tatatcagcg ctgtaaaaat ctatctgtta cagaaggttt 3720
tcgcggtttt tataaacaaa actttcgtta cgaaatcgag caatcacccc agctgcgtat 3780
ttggaaattc gggaaaaagt agagcaacgc gagttgcatt ttttacacca taatgcatga 3840
ttaacttcga gaagggatta aggctaattt cactagtatg tttcaaaaac ctcaatctgt 3900
ccattgaatg ccttataaaa cagctataga ttgcatagaa gagttagcta ctcaatgctt 3960
tttgtcaaag cttactgatg atgatgtgtc tactttcagg cgggtctgta gtaaggagaa 4020
tgacattata aagctggcac ttagaattcc acggactata gactatacta gtatactccg 4080
tctactgtac gatacacttc cgctcaggtc cttgtccttt aacgaggcct taccactctt 4140
ttgttactct attgatccag ctcagcaaag gcagtgtgat ctaagattct atcttcgcga 4200
tgtagtaaaa ctagctagac cgagaaagag actagaaatg caaaaggcac ttctacaatg 4260
gctgccatca ttattatccg atgtgacgct gcattttttt tttttttttt tttttttttt 4320
tttttttttt tttttttttt ttttttggta caaatatcat aaaaaaagag aatcttttta 4380
agcaaggatt ttcttaactt cttcggcgac agcatcaccg acttcggtgg tactgttgga 4440
accacctaaa tcaccagttc tgatacctgc atccaaaacc tttttaactg catcttcaat 4500
ggctttacct tcttcaggca agttcaatga caatttcaac atcattgcag cagacaagat 4560
agtggcgata gggttgacct tattctttgg caaatctgga QCQQaaccat QCtcataqttC 4620
gtacaaacca aatgcggtgt tcttgtctgg caaagaggcc aaqgacQcaQ atggcaacaa 4680
acccaaggag cctgggataa cggaggcttc atcggagatg atatcaccaa acatgttgct 4740
ggtgattata ataccattta ggtgggttgg gttcttaact acrqatcatcTQ cggcagaatc 4800
aatcaattga tgttgaactt tcaatgtagg gaattcgttc ttgatggttt cctccacagt 4860
ttttctccat aatcttgaag aggccaaaac attagcttta tccaaggacc aaataggcaa 4920
tggtggctca tgttgtaggg ccatgaaagc ggccattctt gtgattcttt gcacttctgg 4980
aacggtgtat tgttcactat cccaagcgac accatcacca tcgtcttcct ttctcttacc 5040
aaagtaaata cctcccacta attctctaac aacaacgaag tcagtacctt tagcaaattg 5100
tggcttgatt ggagataagt ctaaaagaga gtcggatgca aagttacatg gtcttaagtt 5160
ggcgtacaat tgaagttctt tacggatttt tagtaaacct tat t caaa tc t" a a fa r* t a f 5220
ggtaccccat ttaggaccac ccacagcacc taacaaaacg gcatcagcct tcttggaggc 5280
ttccagcgcc tcatctggaa gtggaacacc tgtagcatcg atagcagcac caccaattaa 5340
atgattttcg aaatcgaact tgacattgga acgaacatca gaaatagctt taagaacctt 5400
aatggcttcg gctgtgattt cttgaccaac gtggtcacct ggcaaaacga cgatcttctt 5460
aggggcagac attacaatgg tatatccttg aaatatatat aaaaaaaaaa aaaaaaaaaa 5520
aaaaaaaaaa atgcagcttc tcaatgatat tcgaatacgc tttgaggaga tacagcctaa 5580
tatccgacaa actgttttac agatttacga tcgtacttgt tacccatcat tgaattttga 5640
acatccgaac ctgggagttt tccctgaaac agatagtata tttgaacctg tataataata 5700
tatagtctag cgctttacgg aagacaatgt atgtatttcg gttcctggag aaactattgc 5760
atctattgca taggtaatct tgcacgtcgc atccccggtt cattttctgc gtttccatct 5820
tgcacttcaa tagcatatct ttgttaacga agcatctgtg cttcattttg tagaacaaaa 5880
atgcaacgcg agagcgctaa tttttcaaac aaagaatctg agctgcattt ttacagaaca 5940
gaaatgcaac gcgaaagcgc tattttacca acgaagaatc tgtgcttcat ttttgtaaaa 6000
caaaaatgca acgcgagagc gctaattttt caaacaaaga atctgagctg catttttaca 6060
gaacagaaat gcaacgcgag agcgctattt taccaacaaa gaatctatac ttcttttttg 6120
ttctacaaaa atgcatcccg agagcgctat ttttctaaca aagcatctta gattactttt 6180
tttctccttt gtgcgctcta taatgcagtc tcttgataac tttttgcact gtaggtccgt 6240
taaggttaga agaaggctac tttggtgtct attttctctt ccataaaaaa agcctgactc 6300
cacttcccgc gtttactgat tactagcgaa gctgcgggtg cattttttca agataaaggc 6360
atccccgatt atattctata ccgatgtgga ttgcgcatac tttgtgaaca gaaagtgata 6420
gcgttgatga ttcttcattg gtcagaaaat tatgaacggt ttcttctatt ttgtctctat 6480
atactacgta taggaaatgt ttacattttc gtattgtttt cgattcactc tatgaatagt 6540
tcttactaca attttttcgt ctaaagagta atactagaga taaacataaa aaatgtagag 6600
gtcgagttta gatgcaagtt caaggagcga aaggtggatg ggtaggttat atagggatat 6660
agcacagaga tatatagcaa agagatactt ttgagcaatg tttgtggaag cggtattcgc 6720
aatattttag tagctcgtta cagtccggtg cgtttttggt tttttgaaag tgcgtcttca 6780
gagcgctttt ggttttcaaa agcgctctga agttcctata ctttctagag aataggaact 6840
tcggaatagg aacttcaaag cgtttccgaa aacgagcgct tccgaaaatg caacgcgagc 6900
tgcgcacata cagctcactg ttcacgtcgc acctatatct gcgtgttgcc tgtatatata 6960
tatacatgag aagaacggca tagtgcgtgt ttatgcttaa atgcgtactt atatgcgtct 7020
atttatgtag gatgaaaggt agtctagtac ctcctgtgat attatcccat tccatgcggg 7080
gtatcgtatg cttccttcag cactaccctt tagctgttct atatgctgcc actcctcaat 7140
tggattagtc tcatccttca atgctatcat ttcctttgat attggatcat atgcatagta 7200
ccgagaaact agtgcgaagt agtgatcagg tattgctgtt atctgatgag tatacgttgt 7260
cctggccacg gcagaagcac gcttatcgct ccaatttccc acaacattag tcaactccgt 7320
taggcccttc attgaaagaa atgaggtcat caaatgtctt ccaatgtgag attttgggcc 7380
attttttata gcaaagattg aataaggcgc atttttcttc aaagctttat tgtacgatct 7440
gactaagtta tcttttaata attggtattc ctgtttattg cttgaagaat tgccggtcct 7500
atttactcgt tttaggactg gttcagaatt cctcaaaaat tcatccaaat atacaagtgg 7560
atcgatgata agctgtcaaa catgagaatt cttgaagacg aaagggcctc gtgatacgcc 7620
tatttttata ggttaatgtc atgataataa tggtttctta gacgtcaggt ggcacttttc 7680
ggggaaatgt gcgcggaacc cctatttgtt tatttttcta aatacattca aatatgtatc 7740
cgctcatgag acaataaccc tgataaatgc ttcaataata ttgaaaaagg aagagtatga 7800
gtattcaaca tttccgtgtc gcccttattc ccttttttgc ggcattttgc cttcctgttt 7860
ttgctcaccc agaaacgctg gtgaaagtaa aagatgctga agatcagttg ggtgcacgag 7920
tgggttacat cgaactggat ctcaacagcg gtaagatcct tgagagtttt cgccccgaag 7980
aacgttttcc aatgatgagc acttttaaag ttctgctatg tggcgcggta ttatcccgtg 8040
ttgacgccgg gcaagagcaa ctcggtcgcc
agtactcacc agtcacagaa aagcatctta 
gtgctgccat aaccatgagt gataacactg 
gaccgaagga gctaaccgct tttttgcaca 
gttgggaacc ggagctgaat gaagccatac 
cagcaatggc aacaacgttg cgcaaactat 
ggcaacaatt aatagactgg atggaggcgg 
cccttccggc tggctggttt attgctgata 
gtatcattgc agcactgggg ccagatggta 
cggggagtca ggcaactatg gatgaacgaa 
tgattaagca ttggtaactg tcagaccaag 
aacttcattt ttaatttaaa aggatctagg 
aaatccctta acgtgagttt tcgttccact 
gatcttcttg agatcctttt tttctgcgcg 
cgctaccagc ggtggtttgt ttgccggatc 
ctggcttcag cagagcgcag ataccaaata 
accacttcaa gaactctgta gcaccgccta 
tggctgctgc cagtggcgat aagtcgtgtc 
cggataaggc gcagcggtcg ggctgaacgg 
gaacgaccta caccgaactg agatacctac 
ccgaagggag aaaggcggac aggtatccgg 
cgagggagct tccaggggga aacgcctggt 
tctgacttga gcgtcgattt ttgtgatgct 
ccagcaacgc ggccttttta cggttcctgg 
ttcctgcgtt atcccctgat tctgtggata 
ccgctcgccg cagccgaacg accgagcgca 
gcctgatgcg gtattttctc cttacgcatc 
ctctcagtac aatctgctct gatgccgcat 
acgtgactgg gtcatggctg cgccccgaca 
ggcttgtctg ctcccggcat ccgcttacag 



gcatacacta ttctcagaat gacttggttg 8100 
cggatggcat gacagtaaga gaattatgca 8160 
cggccaactt acttctgaca acgatcggag 8220 
acatggggga tcatgtaact cgccttgatc 8280 
caaacgacga gcgtgacacc acgatgcctg 8340 
taactggcga ac tact tact ctagcttccc 8400 
ataaagttgc aggaccactt ctgcgctcgg 8460 
aatctggagc cggtgagcgt gggtctcgcg 8520 
agccctcccg tatcgtagtt atctacacga 8580 
atagacagat cgctgagata ggtgcctcac 8640 
tttactcata tatactttag attgatttaa 8700 
tgaagatcct ttttgataat ctcatgacca 8760 
gagcgtcaga ccccgtagaa aagatcaaag 8820 
taatctgctg cttgcaaaca aaaaaaccac 8880 
aagagctacc aactcttttt ccgaaggtaa 8940 
ctgtccttct agtgtagccg tagttaggcc 9000 
catacctcgc tctgctaatc ctgttaccag 9060 
ttaccgggtt ggactcaaga cgatagttac 9120 
ggggttcgtg cacacagccc agcttggagc 9180 
agcgtgagct atgagaaagc gccacgcttc 9240 
taagcggcag ggtcggaaca ggagagcgca 9300 
atctttatag tcctgtcggg tttcgccacc 9360 
cgtcaggggg gcggagccta tggaaaaacg 9420 
ccttttgctg gccttttgct cacatgttct 9480 
accgtattac cgcctttgag tgagctgata 9540 
gcgagtcagt gagcgaggaa gcggaagagc 9600 
tgtgcggtat ttcacaccgc atatggtgca 9660 
agttaagcca gtatacactc cgctatcgct 9720 
cccgccaaca cccgctgacg cgccctgacg 9780 
acaagctgtg accgtctccg ggagctgcat 9840 
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gtgtcagagg 


ttttcaccgt 


catcaccgaa 


acgcgcgagg 


cagctgcggt 


aaagctcatc 


9900 


agcgtggtcg tgaagcgatt cacagatgtc tgcctgttca tccgcgtcca gctcgttgag 9960

tttctccaga agcgttaatg tctggcttct gataaagcgg gccatgttaa gggcggtttt 10020

ttcctgtttg gtcactgatg cctccgtgta agggggattt ctgttcatgg gggtaatgat 10080

accgatgaaa cgagagagga tgctcacgat acgggttact gatgatgaac atgcccggtt 10140

actggaacgt tgtgagggta aacaactggc ggtatggatg cggcgggacc agagaaaaat 10200

cactcagggt caatgccagc gcttcgttaa tacagatgta ggtgttccac agggtagcca 10260

gcagcatcct gcgatgcaga tccggaacat aatggtgcag ggcgctgact tccgcgtttc 10320

cagactttac gaaacacgga aaccgaagac cattcatgtt gttgctcagg tcgcagacgt 10380

tttgcagcag cagtcgcttc acgttcgctc gcgtatcggt gattcattct gctaaccagt 10440

aaggcaaccc cgccagccta gccgggtcct caacgacagg agcacgatca tgcgcacccg 10500

tggccaggac ccaacgctgc ccgagatgcg ccgcgtgcgg ctgctggaga tggcggacgc 10560

gatggatatg ttctgccaag ggttggtttg cgcattcaca gttctccgca agaattgatt 10620

ggctccaatt cttggagtgg tgaatccgtt agcgaggtgc cgccggcttc cattcaggtc 10680

gaggtggccc ggctccatgc accgcgacgc aacgcgggga ggcagacaag gtatagggcg 10740

gcgcctacaa tccatgccaa cccgttccat gtgctcgccg aggcggcata aatcgccgtg 10800

acgatcagcg gtccaatgat cgaagttagg ctggtaagag ccgcgagcga tccttgaagc 10860

tgtccctgat ggtcgtcatc tacctgcctg gacagcatgg cctgcaacgc gggcatcccg 10920

atgccgccgg aagcgagaag aatcataatg gggaaggcca tccagcctcg cgtcgcgaac 10980

gccagcaaga cgtagcccag cgcgtcggcc gccatgccgg cgataatggc ctgcttctcg 11040

ccgaaacgtt tggtggcggg accagtgacg aaggcttgag cgagggcgtg caagattccg 11100

aataccgcaa gcgacaggcc gatcatcgtc gcgctccagc gaaagcggtc ctcgccgaaa 11160

atgacccaga gcgctgccgg cacctgtcct acgagttgca tgataaagaa gacagtcata 11220

agtgcggcga cgatagtcat gccccgcgcc caccggaagg agctgactgg gttgaaggct 11280

ctcaagggca tcggtcgagg atccttcaat atgcgcacat acgctgttat gttcaaggtc 11340

ccttcgttta agaacgaaag cggtcttcct tttgagggat gtttcaagtt gttcaaatct 11400

atcaaatttg caaatcccca gtctgtatct agagcgttga atcggtgatg cgatttgtta 11460

attaaattga tggtgtcacc attaccaggt ctagatatac caatggcaaa ctgagcacaa 11520


caataccagt ccggatcaac tggcaccatc tctcccgtag tctcatctaa tttttcttcc 11580

ggatgaggtt ccagatatac cgcaacacct ttattatggt ttccctgagg gaataataga 11640

atgtcccatt cgaaatcacc aattctaaac ctgggcgaat tgtatttcgg gtttgttaac 11700

tcgttccagt caggaatgtt ccacgtgaag ctatcttcca gcaaagtctc cacttcttca 11760 

tcaaattgtg gagaatactc ccaatgctct tatctatggg acttccggga aacacagtac 11820 

cgatacttcc caattcgtct tcagagctca ttgtttgttt gaagagacta atcaaagaat 11880 

cgttttctca aaaaaattaa tatcttaact gatagtttga tcaaaggggc aaaacgtagg 11940 

ggcaaacaaa cggaaaaatc gtttctcaaa ttttctgatg ccaagaactc taaccagtct 12000 

tatctaaaaa ttgccttatg atccgtctct ccggttacag cctgtgtaac tgattaatcc 12060 

tgcctttcta atcaccattc taatgtttta attaagggat tttgtcttca ttaacggctt 12120 

tcgctcataa aaatgttatg acgttttgcc cgcaggcggg aaaccatcca cttcacgaga 12180 

ctgatctcct ctgccggaac accgggcatc tccaacttat aagttggaga aataagagaa 12240 

tttcagattg agagaatgaa aaaaaaaaac ccttagttca taggtccatt ctcttagcgc 12300 

aactacagag aacaggggca caaacaggca aaaaacgggc acaacctcaa tggagtgatg 12360 

caacctgcct ggagtaaatg atgacacaag gcaattgacc cacgcatgta tctatctcat 12420 

tttcttacac cttctattac cttctgctct ctctgatttg gaaaaagctg aaaaaaaagg 12480 

ttgaaaccag ttccctgaaa ttattcccct acttgactaa taagtatata aagacggtag 12540 

gtattgattg taattctgta aatctatttc ttaaacttct taaattctac ttttatagtt 12600 

agtctttttt ttagttttaa aacaccaaga acttagtttc gaataaacac acataaacaa 12660 

acaagcttac aaaacaaa atg gct gca tat gca gct cag ggc tat aag gtg 12711
Met Ala Ala Tyr Ala Ala Gln Gly Tyr Lys Val 
1 5 10 

cta gta ctc aac ccc tct gtt gct gca aca ctg ggc ttt ggt gct tac 12759 
Leu Val Leu Asn Pro Ser Val Ala Ala Thr Leu Gly Phe Gly Ala Tyr 
15 20 25 

atg tcc aag gct cat ggg atc gat cct aac atc agg acc ggg gtg aga 12807 
Met Ser Lys Ala His Gly Ile Asp Pro Asn Ile Arg Thr Gly Val Arg 
30 35 40 

aca att acc act ggc agc ccc atc acg tac tcc acc tac ggc aag ttc 12855 
Thr Ile Thr Thr Gly Ser Pro Ile Thr Tyr Ser Thr Tyr Gly Lys Phe 
45 50 55 

ctt gcc gac ggc ggg tgc tcg ggg ggc gct tat gac ata ata att tgt 12903 
Leu Ala Asp Gly Gly Cys Ser Gly Gly Ala Tyr Asp Ile Ile Ile Cys 
60 65 70 75 

gac gag tgc cac tcc acg gat gcc aca tcc atc ttg ggc att ggc act 12951 
Asp Glu Cys His Ser Thr Asp Ala Thr Ser Ile Leu Gly Ile Gly Thr 
80 85 90 

gtc ctt gac caa gca gag act gcg ggg gcg aga ctg gtt gtg ctc gcc 12999 
Val Leu Asp Gln Ala Glu Thr Ala Gly Ala Arg Leu Val Val Leu Ala 
95 100 105 

acc gcc acc cct ccg ggc tcc gtc act gtg ccc cat ccc aac atc gag 13047 
Thr Ala Thr Pro Pro Gly Ser Val Thr Val Pro His Pro Asn Ile Glu 
110 115 120 

gag gtt gct ctg tcc acc acc gga gag atc cct ttt tac ggc aag gct 13095 
Glu Val Ala Leu Ser Thr Thr Gly Glu Ile Pro Phe Tyr Gly Lys Ala 
125 130 135 

atc ccc ctc gaa gta atc aag ggg ggg aga cat ctc atc ttc tgt cat 13143 
Ile Pro Leu Glu Val Ile Lys Gly Gly Arg His Leu Ile Phe Cys His 
140 145 150 155 

tca aag aag aag tgc gac gaa ctc gcc gca aag ctg gtc gca ttg ggc 13191 
Ser Lys Lys Lys Cys Asp Glu Leu Ala Ala Lys Leu Val Ala Leu Gly 
160 165 170 

atc aat gcc gtg gcc tac tac cgc ggt ctt gac gtg tcc gtc atc ccg 13239 
Ile Asn Ala Val Ala Tyr Tyr Arg Gly Leu Asp Val Ser Val Ile Pro 
175 180 185 

acc agc ggc gat gtt gtc gtc gtg gca acc gat gcc ctc atg acc ggc 13287 
Thr Ser Gly Asp Val Val Val Val Ala Thr Asp Ala Leu Met Thr Gly 
190 195 200 

tat acc ggc gac ttc gac tcg gtg ata gac tgc aat acg tgt gtc acc 13335 
Tyr Thr Gly Asp Phe Asp Ser Val Ile Asp Cys Asn Thr Cys Val Thr 
205 210 215 

cag aca gtc gat ttc agc ctt gac cct acc ttc acc att gag aca atc 13383 
Gln Thr Val Asp Phe Ser Leu Asp Pro Thr Phe Thr Ile Glu Thr Ile 
220 225 230 235 

acg ctc ccc caa gat gct gtc tcc cgc act caa cgt cgg ggc agg act 13431 
Thr Leu Pro Gln Asp Ala Val Ser Arg Thr Gln Arg Arg Gly Arg Thr 
240 245 250 

ggc agg ggg aag cca ggc atc tac aga ttt gtg gca ccg ggg gag cgc 13479 
Gly Arg Gly Lys Pro Gly Ile Tyr Arg Phe Val Ala Pro Gly Glu Arg 
255 260 265 

ccc tcc ggc atg ttc gac tcg tcc gtc ctc tgt gag tgc tat gac gca 13527 
Pro Ser Gly Met Phe Asp Ser Ser Val Leu Cys Glu Cys Tyr Asp Ala 
270 275 280 

ggc tgt gct tgg tat gag ctc acg ccc gcc gag act aca gtt agg cta 13575 
Gly Cys Ala Trp Tyr Glu Leu Thr Pro Ala Glu Thr Thr Val Arg Leu 
285 290 295 

cga gcg tac atg aac acc ccg ggg ctt ccc gtg tgc cag gac cat ctt 13623 
Arg Ala Tyr Met Asn Thr Pro Gly Leu Pro Val Cys Gln Asp His Leu 
300 305 310 315 

gaa ttt tgg gag ggc gtc ttt aca ggc ctc act cat ata gat gcc cac 13671 
Glu Phe Trp Glu Gly Val Phe Thr Gly Leu Thr His Ile Asp Ala His 
320 325 330 

ttt cta tcc cag aca aag cag agt ggg gag aac ctt cct tac ctg gta 13719 
Phe Leu Ser Gln Thr Lys Gln Ser Gly Glu Asn Leu Pro Tyr Leu Val 
335 340 345 

gcg tac caa gcc acc gtg tgc gct agg gct caa gcc cct ccc cca tcg 13767 
Ala Tyr Gln Ala Thr Val Cys Ala Arg Ala Gln Ala Pro Pro Pro Ser 
350 355 360 

tgg gac cag atg tgg aag tgt ttg att cgc ctc aag ccc acc ctc cat 13815 
Trp Asp Gln Met Trp Lys Cys Leu Ile Arg Leu Lys Pro Thr Leu His 
365 370 375 

ggg cca aca ccc ctg cta tac aga ctg ggc gct gtt cag aat gaa atc 13863 
Gly Pro Thr Pro Leu Leu Tyr Arg Leu Gly Ala Val Gln Asn Glu Ile 
380 385 390 395 

acc ctg acg cac cca gtc acc aaa tac atc atg aca tgc atg tcg gcc 13911 
Thr Leu Thr His Pro Val Thr Lys Tyr Ile Met Thr Cys Met Ser Ala 
400 405 410 

gac ctg gag gtc gtc acg agc acc tgg gtg ctc gtt ggc ggc gtc ctg 13959 
Asp Leu Glu Val Val Thr Ser Thr Trp Val Leu Val Gly Gly Val Leu 
415 420 425 

gct gct ttg gcc gcg tat tgc ctg tca aca ggc tgc gtg gtc ata gtg 14007 
Ala Ala Leu Ala Ala Tyr Cys Leu Ser Thr Gly Cys Val Val Ile Val 
430 435 440 

ggc agg gtc gtc ttg tcc ggg aag ccg gca atc ata cct gac agg gaa 14055 
Gly Arg Val Val Leu Ser Gly Lys Pro Ala Ile Ile Pro Asp Arg Glu 
445 450 455 

gtc ctc tac cga gag ttc gat gag atg gaa gag tgc tct cag cac tta 14103 
Val Leu Tyr Arg Glu Phe Asp Glu Met Glu Glu Cys Ser Gln His Leu 
460 465 470 475 

ccg tac atc gag caa ggg atg atg ctc gcc gag cag ttc aag cag aag 14151 
Pro Tyr Ile Glu Gln Gly Met Met Leu Ala Glu Gln Phe Lys Gln Lys 
480 485 490 

gcc ctc ggc ctc ctg cag acc gcg tcc cgt cag gca gag gtt atc gcc 14199 
Ala Leu Gly Leu Leu Gln Thr Ala Ser Arg Gln Ala Glu Val Ile Ala 
495 500 505 

cct gct gtc cag acc aac tgg caa aaa ctc gag acc ttc tgg gcg aag 14247 
Pro Ala Val Gln Thr Asn Trp Gln Lys Leu Glu Thr Phe Trp Ala Lys 
510 515 520 

cat atg tgg aac ttc atc agt ggg ata caa tac ttg gcg ggc ttg tca 14295 
His Met Trp Asn Phe Ile Ser Gly Ile Gln Tyr Leu Ala Gly Leu Ser 
525 530 535 

acg ctg cct ggt aac ccc gcc att gct tca ttg atg gct ttt aca gct 14343 
Thr Leu Pro Gly Asn Pro Ala Ile Ala Ser Leu Met Ala Phe Thr Ala 
540 545 550 555 

gct gtc acc agc cca cta acc act agc caa acc ctc ctc ttc aac ata 14391 
Ala Val Thr Ser Pro Leu Thr Thr Ser Gln Thr Leu Leu Phe Asn Ile 
560 565 570 

ttg ggg ggg tgg gtg gct gcc cag ctc gcc gcc ccc ggt gcc gct act 14439 
Leu Gly Gly Trp Val Ala Ala Gln Leu Ala Ala Pro Gly Ala Ala Thr 
575 580 585 

gcc ttt gtg ggc gct ggc tta gct ggc gcc gcc atc ggc agt gtt gga 14487 
Ala Phe Val Gly Ala Gly Leu Ala Gly Ala Ala Ile Gly Ser Val Gly 
590 595 600 

ctg ggg aag gtc ctc ata gac atc ctt gca ggg tat ggc gcg ggc gtg 14535 
Leu Gly Lys Val Leu Ile Asp Ile Leu Ala Gly Tyr Gly Ala Gly Val 
605 610 615 

gcg gga gct ctt gtg gca ttc aag atc atg agc ggt gag gtc ccc tcc 14583 
Ala Gly Ala Leu Val Ala Phe Lys Ile Met Ser Gly Glu Val Pro Ser 
620 625 630 635 

acg gag gac ctg gtc aat cta ctg ccc gcc atc ctc tcg ccc gga gcc 14631 
Thr Glu Asp Leu Val Asn Leu Leu Pro Ala Ile Leu Ser Pro Gly Ala 
640 645 650 

ctc gta gtc ggc gtg gtc tgt gca gca ata ctg cgc cgg cac gtt ggc 14679 
Leu Val Val Gly Val Val Cys Ala Ala Ile Leu Arg Arg His Val Gly 
655 660 665 

ccg ggc gag ggg gca gtg cag tgg atg aac cgg ctg ata gcc ttc gcc 14727 
Pro Gly Glu Gly Ala Val Gln Trp Met Asn Arg Leu Ile Ala Phe Ala 
670 675 680 

tcc cgg ggg aac cat gtt tcc ccc acg cac tac gtg ccg gag agc gat 14775 
Ser Arg Gly Asn His Val Ser Pro Thr His Tyr Val Pro Glu Ser Asp 
685 690 695 

gca gct gcc cgc gtc act gcc ata ctc agc agc ctc act gta acc cag 14823 
Ala Ala Ala Arg Val Thr Ala Ile Leu Ser Ser Leu Thr Val Thr Gln 
700 705 710 715 

ctc ctg agg cga ctg cac cag tgg ata agc tcg gag tgt acc act cca 14871 
Leu Leu Arg Arg Leu His Gln Trp Ile Ser Ser Glu Cys Thr Thr Pro 
720 725 730 

tgc tcc ggt tcc tgg cta agg gac atc tgg gac tgg ata tgc gag gtg 14919 
Cys Ser Gly Ser Trp Leu Arg Asp Ile Trp Asp Trp Ile Cys Glu Val 
735 740 745 

ttg agc gac ttt aag acc tgg cta aaa gct aag ctc atg cca cag ctg 14967 
Leu Ser Asp Phe Lys Thr Trp Leu Lys Ala Lys Leu Met Pro Gln Leu 
750 755 760 

cct ggg atc ccc ttt gtg tcc tgc cag cgc ggg tat aag ggg gtc tgg 15015 
Pro Gly Ile Pro Phe Val Ser Cys Gln Arg Gly Tyr Lys Gly Val Trp 
765 770 775 

cga ggg gac ggc atc atg cac act cgc tgc cac tgt gga gct gag atc 15063 
Arg Gly Asp Gly Ile Met His Thr Arg Cys His Cys Gly Ala Glu Ile 
780 785 790 795 

act gga cat gtc aaa aac ggg acg atg agg atc gtc ggt cct agg acc 15111 
Thr Gly His Val Lys Asn Gly Thr Met Arg Ile Val Gly Pro Arg Thr 
800 805 810 

tgc agg aac atg tgg agt ggg acc ttc ccc att aat gcc tac acc acg 15159 
Cys Arg Asn Met Trp Ser Gly Thr Phe Pro Ile Asn Ala Tyr Thr Thr 
815 820 825 

ggc ccc tgt acc ccc ctt cct gcg ccg aac tac acg ttc gcg cta tgg 15207 
Gly Pro Cys Thr Pro Leu Pro Ala Pro Asn Tyr Thr Phe Ala Leu Trp 
830 835 840 

agg gtg tct gca gag gaa tac gtg gag ata agg cag gtg ggg gac ttc 15255 
Arg Val Ser Ala Glu Glu Tyr Val Glu Ile Arg Gln Val Gly Asp Phe 
845 850 855 

cac tac gtg acg ggt atg act act gac aat ctt aaa tgc ccg tgc cag 15303 
His Tyr Val Thr Gly Met Thr Thr Asp Asn Leu Lys Cys Pro Cys Gln 
860 865 870 875 

gtc cca tcg ccc gaa ttt ttc aca gaa ttg gac ggg gtg cgc cta cat 15351 
Val Pro Ser Pro Glu Phe Phe Thr Glu Leu Asp Gly Val Arg Leu His 
880 885 890 

agg ttt gcg ccc ccc tgc aag ccc ttg ctg cgg gag gag gta tca ttc 15399 
Arg Phe Ala Pro Pro Cys Lys Pro Leu Leu Arg Glu Glu Val Ser Phe 
895 900 905 

aga gta gga ctc cac gaa tac ccg gta ggg tcg caa tta cct tgc gag 15447 
Arg Val Gly Leu His Glu Tyr Pro Val Gly Ser Gln Leu Pro Cys Glu 
910 915 920 

ccc gaa ccg gac gtg gcc gtg ttg acg tcc atg ctc act gat ccc tcc 15495 
Pro Glu Pro Asp Val Ala Val Leu Thr Ser Met Leu Thr Asp Pro Ser 
925 930 935 

cat ata aca gca gag gcg gcc ggg cga agg ttg gcg agg gga tca ccc 15543 
His Ile Thr Ala Glu Ala Ala Gly Arg Arg Leu Ala Arg Gly Ser Pro 
940 945 950 955 

ccc tct gtg gcc agc tcc tcg gct agc cag cta tcc gct cca tct ctc 15591 
Pro Ser Val Ala Ser Ser Ser Ala Ser Gln Leu Ser Ala Pro Ser Leu 
960 965 970 

aag gca act tgc acc gct aac cat gac tcc cct gat gct gag ctc ata 15639 
Lys Ala Thr Cys Thr Ala Asn His Asp Ser Pro Asp Ala Glu Leu Ile 
975 980 985 

gag gcc aac ctc cta tgg agg cag gag atg ggc ggc aac atc acc agg 15687 
Glu Ala Asn Leu Leu Trp Arg Gln Glu Met Gly Gly Asn Ile Thr Arg 
990 995 1000 

gtt gag tca gaa aac aaa gtg gtg att ctg gac tcc ttc gat ccg ctt 15735 
Val Glu Ser Glu Asn Lys Val Val Ile Leu Asp Ser Phe Asp Pro Leu 
1005 1010 1015 

gtg gcg gag gag gac gag cgg gag atc tcc gta ccc gca gaa atc ctg 15783 
Val Ala Glu Glu Asp Glu Arg Glu Ile Ser Val Pro Ala Glu Ile Leu 
1020 1025 1030 1035 



cgg aag tct cgg aga ttc gcc cag gcc ctg ccc gtt tgg gcg cgg ccg 15831 
Arg Lys Ser Arg Arg Phe Ala Gln Ala Leu Pro Val Trp Ala Arg Pro 
1040 1045 1050 

gac tat aac ccc ccg cta gtg gag acg tgg aaa aag ccc gac tac gaa 15879 
Asp Tyr Asn Pro Pro Leu Val Glu Thr Trp Lys Lys Pro Asp Tyr Glu 
1055 1060 1065 

cca cct gtg gtc cat ggc tgc ccg ctt cca cct cca aag tcc cct cct 15927 
Pro Pro Val Val His Gly Cys Pro Leu Pro Pro Pro Lys Ser Pro Pro 
1070 1075 1080 

gtg cct ccg cct cgg aag aag cgg acg gtg gtc ctc act gaa tca acc 15975 
Val Pro Pro Pro Arg Lys Lys Arg Thr Val Val Leu Thr Glu Ser Thr 
1085 1090 1095 

cta tct act gcc ttg gcc gag ctc gcc acc aga agc ttt ggc agc tcc 16023 
Leu Ser Thr Ala Leu Ala Glu Leu Ala Thr Arg Ser Phe Gly Ser Ser 
1100 1105 1110 1115 

tca act tcc ggc att acg ggc gac aat acg aca aca tcc tct gag ccc 16071 
Ser Thr Ser Gly Ile Thr Gly Asp Asn Thr Thr Thr Ser Ser Glu Pro 
1120 1125 1130 

gcc cct tct ggc tgc ccc ccc gac tcc gac gct gag tcc tat tcc tcc 16119 
Ala Pro Ser Gly Cys Pro Pro Asp Ser Asp Ala Glu Ser Tyr Ser Ser 
1135 1140 1145 

atg ccc ccc ctg gag ggg gag cct ggg gat ccg gat ctt agc gac ggg 16167 
Met Pro Pro Leu Glu Gly Glu Pro Gly Asp Pro Asp Leu Ser Asp Gly 
1150 1155 1160 

tca tgg tca acg gtc agt agt gag gcc aac gcg gag gat gtc gtg tgc 16215 
Ser Trp Ser Thr Val Ser Ser Glu Ala Asn Ala Glu Asp Val Val Cys 
1165 1170 1175 

tgc tca atg tct tac tct tgg aca ggc gca ctc gtc acc ccg tgc gcc 16263 
Cys Ser Met Ser Tyr Ser Trp Thr Gly Ala Leu Val Thr Pro Cys Ala 
1180 1185 1190 1195 

gcg gaa gaa cag aaa ctg ccc atc aat gca cta agc aac tcg ttg cta 16311 
Ala Glu Glu Gln Lys Leu Pro Ile Asn Ala Leu Ser Asn Ser Leu Leu 
1200 1205 1210 

cgt cac cac aat ttg gtg tat tcc acc acc tca cgc agt gct tgc caa 16359 
Arg His His Asn Leu Val Tyr Ser Thr Thr Ser Arg Ser Ala Cys Gln 
1215 1220 1225 

agg cag aag aaa gtc aca ttt gac aga ctg caa gtt ctg gac agc cat 16407 
Arg Gln Lys Lys Val Thr Phe Asp Arg Leu Gln Val Leu Asp Ser His 
1230 1235 1240 

tac cag gac gta ctc aag gag gtt aaa gca gcg gcg tca aaa gtg aag 16455 
Tyr Gln Asp Val Leu Lys Glu Val Lys Ala Ala Ala Ser Lys Val Lys 
1245 1250 1255 

gct aac ttg cta tcc gta gag gaa gct tgc agc ctg acg ccc cca cac 16503 
Ala Asn Leu Leu Ser Val Glu Glu Ala Cys Ser Leu Thr Pro Pro His 
1260 1265 1270 1275 



tca gcc aaa tcc aag ttt ggt tat ggg gca aaa gac gtc cgt tgc cat 16551 
Ser Ala Lys Ser Lys Phe Gly Tyr Gly Ala Lys Asp Val Arg Cys His 
1280 1285 1290 

gcc aga aag gcc gta acc cac atc aac tcc gtg tgg aaa gac ctt ctg 16599 
Ala Arg Lys Ala Val Thr His Ile Asn Ser Val Trp Lys Asp Leu Leu 
1295 1300 1305 

gaa gac aat gta aca cca ata gac act acc atc atg gct aag aac gag 16647 
Glu Asp Asn Val Thr Pro Ile Asp Thr Thr Ile Met Ala Lys Asn Glu 
1310 1315 1320 

gtt ttc tgc gtt cag cct gag aag ggg ggt cgt aag cca gct cgt ctc 16695 
Val Phe Cys Val Gln Pro Glu Lys Gly Gly Arg Lys Pro Ala Arg Leu 
1325 1330 1335 

atc gtg ttc ccc gat ctg ggc gtg cgc gtg tgc gaa aag atg gct ttg 16743 
Ile Val Phe Pro Asp Leu Gly Val Arg Val Cys Glu Lys Met Ala Leu 
1340 1345 1350 1355 

tac gac gtg gtt aca aag ctc ccc ttg gcc gtg atg gga agc tcc tac 16791 
Tyr Asp Val Val Thr Lys Leu Pro Leu Ala Val Met Gly Ser Ser Tyr 
1360 1365 1370 

gga ttc caa tac tca cca gga cag cgg gtt gaa ttc ctc gtg caa gcg 16839 
Gly Phe Gln Tyr Ser Pro Gly Gln Arg Val Glu Phe Leu Val Gln Ala 
1375 1380 1385 

tgg aag tcc aag aaa acc cca atg ggg ttc tcg tat gat acc cgc tgc 16887 
Trp Lys Ser Lys Lys Thr Pro Met Gly Phe Ser Tyr Asp Thr Arg Cys 
1390 1395 1400 

ttt gac tcc aca gtc act gag agc gac atc cgt acg gag gag gca atc 16935 
Phe Asp Ser Thr Val Thr Glu Ser Asp Ile Arg Thr Glu Glu Ala Ile 
1405 1410 1415 

tac caa tgt tgt gac ctc gac ccc caa gcc cgc gtg gcc atc aag tcc 16983 
Tyr Gln Cys Cys Asp Leu Asp Pro Gln Ala Arg Val Ala Ile Lys Ser 
1420 1425 1430 1435 

ctc acc gag agg ctt tat gtt ggg ggc cct ctt acc aat tca agg ggg 17031 
Leu Thr Glu Arg Leu Tyr Val Gly Gly Pro Leu Thr Asn Ser Arg Gly 
1440 1445 1450 

gag aac tgc ggc tat cgc agg tgc cgc gcg agc ggc gta ctg aca act 17079 
Glu Asn Cys Gly Tyr Arg Arg Cys Arg Ala Ser Gly Val Leu Thr Thr 
1455 1460 1465 

agc tgt ggt aac acc ctc act tgc tac atc aag gcc cgg gca gcc tgt 17127 
Ser Cys Gly Asn Thr Leu Thr Cys Tyr Ile Lys Ala Arg Ala Ala Cys 
1470 1475 1480 

cga gcc gca ggg ctc cag gac tgc acc atg ctc gtg tgt ggc gac gac 17175 
Arg Ala Ala Gly Leu Gln Asp Cys Thr Met Leu Val Cys Gly Asp Asp 
1485 1490 1495 

tta gtc gtt atc tgt gaa agc gcg ggg gtc cag gag gac gcg gcg agc 17223 
Leu Val Val Ile Cys Glu Ser Ala Gly Val Gln Glu Asp Ala Ala Ser 
1500 1505 1510 1515 



ctg aga gcc ttc acg gag gct atg acc agg tac tcc gcc ccc cct ggg 17271 
Leu Arg Ala Phe Thr Glu Ala Met Thr Arg Tyr Ser Ala Pro Pro Gly 
1520 1525 1530 

gac ccc cca caa cca gaa tac gac ttg gag ctc ata aca tca tgc tcc 17319 
Asp Pro Pro Gln Pro Glu Tyr Asp Leu Glu Leu Ile Thr Ser Cys Ser 
1535 1540 1545 

tcc aac gtg tca gtc gcc cac gac ggc gct gga aag agg gtc tac tac 17367 
Ser Asn Val Ser Val Ala His Asp Gly Ala Gly Lys Arg Val Tyr Tyr 
1550 1555 1560 

ctc acc cgt gac cct aca acc ccc ctc gcg aga gct gcg tgg gag aca 17415 
Leu Thr Arg Asp Pro Thr Thr Pro Leu Ala Arg Ala Ala Trp Glu Thr 
1565 1570 1575 

gca aga cac act cca gtc aat tcc tgg cta ggc aac ata atc atg ttt 17463 
Ala Arg His Thr Pro Val Asn Ser Trp Leu Gly Asn Ile Ile Met Phe 
1580 1585 1590 1595 

gcc ccc aca ctg tgg gcg agg atg ata ctg atg acc cat ttc ttt agc 17511 
Ala Pro Thr Leu Trp Ala Arg Met Ile Leu Met Thr His Phe Phe Ser 
1600 1605 1610 

gtc ctt ata gcc agg gac cag ctt gaa cag gcc ctc gat tgc gag atc 17559 
Val Leu Ile Ala Arg Asp Gln Leu Glu Gln Ala Leu Asp Cys Glu Ile 
1615 1620 1625 

tac ggg gcc tgc tac tcc ata gaa cca ctg gat cta cct cca atc att 17607 
Tyr Gly Ala Cys Tyr Ser Ile Glu Pro Leu Asp Leu Pro Pro Ile Ile 
1630 1635 1640 

caa aga ctc cat ggc ctc agc gca ttt tca ctc cac agt tac tct cca 17655 
Gln Arg Leu His Gly Leu Ser Ala Phe Ser Leu His Ser Tyr Ser Pro 
1645 1650 1655 

ggt gaa atc aat agg gtg gcc gca tgc ctc aga aaa ctt ggg gta ccg 17703 
Gly Glu Ile Asn Arg Val Ala Ala Cys Leu Arg Lys Leu Gly Val Pro 
1660 1665 1670 1675 

ccc ttg cga gct tgg aga cac cgg gcc cgg agc gtc cgc gct agg ctt 17751 
Pro Leu Arg Ala Trp Arg His Arg Ala Arg Ser Val Arg Ala Arg Leu 
1680 1685 1690 

ctg gcc aga gga ggc agg gct gcc ata tgt ggc aag tac ctc ttc aac 17799 
Leu Ala Arg Gly Gly Arg Ala Ala Ile Cys Gly Lys Tyr Leu Phe Asn 
1695 1700 1705 

tgg gca gta aga aca aag ctc aaa ctc act cca ata gcg gcc gct ggc 17847 
Trp Ala Val Arg Thr Lys Leu Lys Leu Thr Pro Ile Ala Ala Ala Gly 
1710 1715 1720 

cag ctg gac ttg tcc ggc tgg ttc acg gct ggc tac agc ggg gga gac 17895 
Gln Leu Asp Leu Ser Gly Trp Phe Thr Ala Gly Tyr Ser Gly Gly Asp 
1725 1730 1735 

att tat cac agc gtg tct cat gcc cgg ccc cgc tgg atc tgg ttt tgc 17943 
Ile Tyr His Ser Val Ser His Ala Arg Pro Arg Trp Ile Trp Phe Cys 
1740 1745 1750 1755 



cta ctc ctg ctt gct gca ggg gta ggc atc tac ctc ctc ccc aac cga 17991 
Leu Leu Leu Leu Ala Ala Gly Val Gly Ile Tyr Leu Leu Pro Asn Arg 
1760 1765 1770 

atg agc acg aat cct aaa cct caa aga aag acc aaa cgt aac acc aac 18039 
Met Ser Thr Asn Pro Lys Pro Gln Arg Lys Thr Lys Arg Asn Thr Asn 
1775 1780 1785 

cgg cgg ccg cag gac gtc aag ttc ccg ggt ggc ggt cag atc gtt ggt 18087 
Arg Arg Pro Gln Asp Val Lys Phe Pro Gly Gly Gly Gln Ile Val Gly 
1790 1795 1800 

gga gtt tac ttg ttg ccg cgc agg ggc cct aga ttg ggt gtg cgc gcg 18135 
Gly Val Tyr Leu Leu Pro Arg Arg Gly Pro Arg Leu Gly Val Arg Ala 
1805 1810 1815 

acg aga aag act tcc gag cgg tcg caa cct cga ggt aga cgt cag cct 18183 
Thr Arg Lys Thr Ser Glu Arg Ser Gln Pro Arg Gly Arg Arg Gln Pro 
1820 1825 1830 1835 

atc ccc aag gct cgt cgg ccc gag ggc agg acc tgg gct cag ccc ggg 18231 
Ile Pro Lys Ala Arg Arg Pro Glu Gly Arg Thr Trp Ala Gln Pro Gly 
1840 1845 1850 

tac cct tgg ccc ctc tat ggc aat gag ggc tgc ggg tgg gcg gga tgg 18279 
Tyr Pro Trp Pro Leu Tyr Gly Asn Glu Gly Cys Gly Trp Ala Gly Trp 
1855 1860 1865 

ctc ctg tct ccc cgt ggc tct cgg cct agc tgg ggc ccc aca gac ccc 18327 
Leu Leu Ser Pro Arg Gly Ser Arg Pro Ser Trp Gly Pro Thr Asp Pro 
1870 1875 1880 

cgg cgt agg tcg cgc aat ttg ggt aag taatagtcga ctttgttccc 18374 
Arg Arg Arg Ser Arg Asn Leu Gly Lys 
1885 1890 



actgtacttt tagctcgtac aaaatacaat atacttttca tttctccgta aacaacatgt 18434

tttcccatgt aatatccttt tctatttttc gttccgttac caactttaca catactttat 18494

atagctattc acttctatac actaaaaaac taagacaatt ttaattttgc tgcctgccat 18554

atttcaattt gttataaatt cctataattt atcctattag tagctaaaaa aagatgaatg 18614

tgaatcgaat cctaagagaa ttggatctga tccacaggac gggtgtggtc gccatgatcg 18674

cgtagtcgat agtggctcca agtagcgaag cgagcaggac tgggcggcgg ccaaagcggt 18734

cggacagtgc tccgagaacg ggtgcgcata gaaattgcat caacgcatat agcgctagca 18794

gcacgccata gtgactggcg atgctgtcgg aatggacgat atcccgcaag aggcccggca 18854

gtaccggcat aaccaagcct atgcctacag catccagggt gacggtgccg aggatgacga 18914

tgagcgcatt gttagatttc atacacggtg cctgactgcg ttagcaattt aactgtgata 18974

aactaccgca ttaaagcttt ttctttccaa tttttttttt ttcgtcatta taaaaatcat 19034

tacgaccgag attcccgggt aataactgat ataattaaat tgaagctcta atttgtgagt 19094

ttagtataca tgcatttact tataatacag ttttttagtt ttgctggccg catcttctca 19154

aatatgcttc ccagcctgct tttctgtaac gttcaccctc taccttagca tcccttccct 19214

ttgcaaatag tcctcttcca acaataataa tgtcagatcc tgtagagacc acatcatcca 19274

cggttctata ctgttgaccc aatgcgtctc ccttgtcatc taaacccaca ccgggtgtca 19334

taatcaacca atcgtaacct tcatctcttc cacccatgtc tctttgagca ataaagccga 19394

taacaaaatc tttgtcgctc ttcgcaatgt caacagtacc cttagtatat tctccagtag 19454

atagggagcc cttgcatgac aattctgcta acatcaaaag gcctctaggt tcctttgtta 19514

cttcttctgc cgcctgcttc aaaccgctaa caatacctgg gcccaccaca ccgtgtgcat 19574

tcgtaatgtc tgcccattct gctattctgt atacacccgc agagtactgc aatttgactg 19634

tattaccaat gtcagcaaat tttctgtctt cgaagagtaa aaaattgtac ttggcggata 19694

atgcctttag cggcttaact gtgccctcca tggaaaaatc agtcaagata tccacatgtg 19754

tttttagtaa acaaattttg ggacctaatg cttcaactaa ctccagtaat tccttggtgg 19814

tacgaacatc caatgaagca cacaagtttg tttgcttttc gtgcatgata ttaaatagct 19874

tggcagcaac aggactagga tgagtagcag cacgttcctt atatgtagct ttcgacatga 19934

tttatcttcg tttcctgcag gtttttgttc tgtgcagttg ggttaagaat actgggcaat 19994

ttcatgtttc ttcaacacta catatgcgta tatataccaa tctaagtctg tgctccttcc 20054

ttcgttcttc cttctgttcg gagattaccg aatcaaaaaa atttcaagga aaccgaaatc 20114

aaaaaaaaga ataaaaaaaa aatgatgaat tgaaaagctt atcgat 20160



<210> 13 
<211> 1892 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: 
pd.delta.NS3NS5.pj.core121 

<400> 13 

Met Ala Ala Tyr Ala Ala Gln Gly Tyr Lys Val Leu Val Leu Asn Pro 
1 5 10 15 

Ser Val Ala Ala Thr Leu Gly Phe Gly Ala Tyr Met Ser Lys Ala His 
20 25 30 

Gly Ile Asp Pro Asn Ile Arg Thr Gly Val Arg Thr Ile Thr Thr Gly 
35 40 45 

Ser Pro Ile Thr Tyr Ser Thr Tyr Gly Lys Phe Leu Ala Asp Gly Gly 
50 55 60 

Cys Ser Gly Gly Ala Tyr Asp Ile Ile Ile Cys Asp Glu Cys His Ser 
65 70 75 80 

Thr Asp Ala Thr Ser Ile Leu Gly Ile Gly Thr Val Leu Asp Gln Ala 
85 90 95 

Glu Thr Ala Gly Ala Arg Leu Val Val Leu Ala Thr Ala Thr Pro Pro 
100 105 110 

Gly Ser Val Thr Val Pro His Pro Asn Ile Glu Glu Val Ala Leu Ser 
115 120 125 

Thr Thr Gly Glu Ile Pro Phe Tyr Gly Lys Ala Ile Pro Leu Glu Val 
130 135 140 

Ile Lys Gly Gly Arg His Leu Ile Phe Cys His Ser Lys Lys Lys Cys 
145 150 155 160 

Asp Glu Leu Ala Ala Lys Leu Val Ala Leu Gly Ile Asn Ala Val Ala 
165 170 175 

Tyr Tyr Arg Gly Leu Asp Val Ser Val Ile Pro Thr Ser Gly Asp Val 
180 185 190 

Val Val Val Ala Thr Asp Ala Leu Met Thr Gly Tyr Thr Gly Asp Phe 
195 200 205 

Asp Ser Val Ile Asp Cys Asn Thr Cys Val Thr Gln Thr Val Asp Phe 
210 215 220 

Ser Leu Asp Pro Thr Phe Thr Ile Glu Thr Ile Thr Leu Pro Gln Asp 
225 230 235 240 

Ala Val Ser Arg Thr Gln Arg Arg Gly Arg Thr Gly Arg Gly Lys Pro 
245 250 255 

Gly Ile Tyr Arg Phe Val Ala Pro Gly Glu Arg Pro Ser Gly Met Phe 
260 265 270 

Asp Ser Ser Val Leu Cys Glu Cys Tyr Asp Ala Gly Cys Ala Trp Tyr 
275 280 285 

Glu Leu Thr Pro Ala Glu Thr Thr Val Arg Leu Arg Ala Tyr Met Asn 
290 295 300 

Thr Pro Gly Leu Pro Val Cys Gln Asp His Leu Glu Phe Trp Glu Gly 
305 310 315 320 

Val Phe Thr Gly Leu Thr His Ile Asp Ala His Phe Leu Ser Gln Thr 
325 330 335 

Lys Gln Ser Gly Glu Asn Leu Pro Tyr Leu Val Ala Tyr Gln Ala Thr 
340 345 350 

Val Cys Ala Arg Ala Gln Ala Pro Pro Pro Ser Trp Asp Gln Met Trp 
355 360 365 

Lys Cys Leu Ile Arg Leu Lys Pro Thr Leu His Gly Pro Thr Pro Leu 
370 375 380 



Leu Tyr Arg Leu Gly Ala Val Gln Asn Glu Ile Thr Leu Thr His Pro 
385 390 395 400 

Val Thr Lys Tyr Ile Met Thr Cys Met Ser Ala Asp Leu Glu Val Val 
405 410 415 

Thr Ser Thr Trp Val Leu Val Gly Gly Val Leu Ala Ala Leu Ala Ala 
420 425 430 

Tyr Cys Leu Ser Thr Gly Cys Val Val Ile Val Gly Arg Val Val Leu 
435 440 445 

Ser Gly Lys Pro Ala Ile Ile Pro Asp Arg Glu Val Leu Tyr Arg Glu 
450 455 460 

Phe Asp Glu Met Glu Glu Cys Ser Gln His Leu Pro Tyr Ile Glu Gln 
465 470 475 480 

Gly Met Met Leu Ala Glu Gln Phe Lys Gln Lys Ala Leu Gly Leu Leu 
485 490 495 

Gln Thr Ala Ser Arg Gln Ala Glu Val Ile Ala Pro Ala Val Gln Thr 
500 505 510 

Asn Trp Gln Lys Leu Glu Thr Phe Trp Ala Lys His Met Trp Asn Phe 
515 520 525 

Ile Ser Gly Ile Gln Tyr Leu Ala Gly Leu Ser Thr Leu Pro Gly Asn 
530 535 540 

Pro Ala Ile Ala Ser Leu Met Ala Phe Thr Ala Ala Val Thr Ser Pro 
545 550 555 560 

Leu Thr Thr Ser Gln Thr Leu Leu Phe Asn Ile Leu Gly Gly Trp Val 
565 570 575 

Ala Ala Gln Leu Ala Ala Pro Gly Ala Ala Thr Ala Phe Val Gly Ala 
580 585 590 

Gly Leu Ala Gly Ala Ala Ile Gly Ser Val Gly Leu Gly Lys Val Leu 
595 600 605 

Ile Asp Ile Leu Ala Gly Tyr Gly Ala Gly Val Ala Gly Ala Leu Val 
610 615 620 

Ala Phe Lys Ile Met Ser Gly Glu Val Pro Ser Thr Glu Asp Leu Val 
625 630 635 640 

Asn Leu Leu Pro Ala Ile Leu Ser Pro Gly Ala Leu Val Val Gly Val 
645 650 655 

Val Cys Ala Ala Ile Leu Arg Arg His Val Gly Pro Gly Glu Gly Ala 
660 665 670 

Val Gln Trp Met Asn Arg Leu Ile Ala Phe Ala Ser Arg Gly Asn His 
675 680 685 

Val Ser Pro Thr His Tyr Val Pro Glu Ser Asp Ala Ala Ala Arg Val 
690 695 700 

Thr Ala Ile Leu Ser Ser Leu Thr Val Thr Gln Leu Leu Arg Arg Leu 
705 710 715 720 



His Gln Trp Ile Ser Ser Glu Cys Thr Thr Pro Cys Ser Gly Ser Trp
725 730 735

Leu Arg Asp Ile Trp Asp Trp Ile Cys Glu Val Leu Ser Asp Phe Lys
740 745 750

Thr Trp Leu Lys Ala Lys Leu Met Pro Gln Leu Pro Gly Ile Pro Phe
755 760 765

Val Ser Cys Gln Arg Gly Tyr Lys Gly Val Trp Arg Gly Asp Gly Ile
770 775 780

Met His Thr Arg Cys His Cys Gly Ala Glu Ile Thr Gly His Val Lys
785 790 795 800

Asn Gly Thr Met Arg Ile Val Gly Pro Arg Thr Cys Arg Asn Met Trp
805 810 815

Ser Gly Thr Phe Pro Ile Asn Ala Tyr Thr Thr Gly Pro Cys Thr Pro
820 825 830

Leu Pro Ala Pro Asn Tyr Thr Phe Ala Leu Trp Arg Val Ser Ala Glu
835 840 845

Glu Tyr Val Glu Ile Arg Gln Val Gly Asp Phe His Tyr Val Thr Gly
850 855 860

Met Thr Thr Asp Asn Leu Lys Cys Pro Cys Gln Val Pro Ser Pro Glu
865 870 875 880

Phe Phe Thr Glu Leu Asp Gly Val Arg Leu His Arg Phe Ala Pro Pro
885 890 895

Cys Lys Pro Leu Leu Arg Glu Glu Val Ser Phe Arg Val Gly Leu His
900 905 910

Glu Tyr Pro Val Gly Ser Gln Leu Pro Cys Glu Pro Glu Pro Asp Val
915 920 925

Ala Val Leu Thr Ser Met Leu Thr Asp Pro Ser His Ile Thr Ala Glu
930 935 940

Ala Ala Gly Arg Arg Leu Ala Arg Gly Ser Pro Pro Ser Val Ala Ser
945 950 955 960

Ser Ser Ala Ser Gln Leu Ser Ala Pro Ser Leu Lys Ala Thr Cys Thr
965 970 975

Ala Asn His Asp Ser Pro Asp Ala Glu Leu Ile Glu Ala Asn Leu Leu
980 985 990

Trp Arg Gln Glu Met Gly Gly Asn Ile Thr Arg Val Glu Ser Glu Asn
995 1000 1005

Lys Val Val Ile Leu Asp Ser Phe Asp Pro Leu Val Ala Glu Glu Asp
1010 1015 1020

Glu Arg Glu Ile Ser Val Pro Ala Glu Ile Leu Arg Lys Ser Arg Arg
1025 1030 1035 1040

Phe Ala Gln Ala Leu Pro Val Trp Ala Arg Pro Asp Tyr Asn Pro Pro
1045 1050 1055

Leu Val Glu Thr Trp Lys Lys Pro Asp Tyr Glu Pro Pro Val Val His
1060 1065 1070

Gly Cys Pro Leu Pro Pro Pro Lys Ser Pro Pro Val Pro Pro Pro Arg
1075 1080 1085

Lys Lys Arg Thr Val Val Leu Thr Glu Ser Thr Leu Ser Thr Ala Leu
1090 1095 1100

Ala Glu Leu Ala Thr Arg Ser Phe Gly Ser Ser Ser Thr Ser Gly Ile
1105 1110 1115 1120

Thr Gly Asp Asn Thr Thr Thr Ser Ser Glu Pro Ala Pro Ser Gly Cys
1125 1130 1135

Pro Pro Asp Ser Asp Ala Glu Ser Tyr Ser Ser Met Pro Pro Leu Glu
1140 1145 1150

Gly Glu Pro Gly Asp Pro Asp Leu Ser Asp Gly Ser Trp Ser Thr Val
1155 1160 1165

Ser Ser Glu Ala Asn Ala Glu Asp Val Val Cys Cys Ser Met Ser Tyr
1170 1175 1180

Ser Trp Thr Gly Ala Leu Val Thr Pro Cys Ala Ala Glu Glu Gln Lys
1185 1190 1195 1200

Leu Pro Ile Asn Ala Leu Ser Asn Ser Leu Leu Arg His His Asn Leu
1205 1210 1215

Val Tyr Ser Thr Thr Ser Arg Ser Ala Cys Gln Arg Gln Lys Lys Val
1220 1225 1230

Thr Phe Asp Arg Leu Gln Val Leu Asp Ser His Tyr Gln Asp Val Leu
1235 1240 1245

Lys Glu Val Lys Ala Ala Ala Ser Lys Val Lys Ala Asn Leu Leu Ser
1250 1255 1260

Val Glu Glu Ala Cys Ser Leu Thr Pro Pro His Ser Ala Lys Ser Lys
1265 1270 1275 1280

Phe Gly Tyr Gly Ala Lys Asp Val Arg Cys His Ala Arg Lys Ala Val
1285 1290 1295

Thr His Ile Asn Ser Val Trp Lys Asp Leu Leu Glu Asp Asn Val Thr
1300 1305 1310

Pro Ile Asp Thr Thr Ile Met Ala Lys Asn Glu Val Phe Cys Val Gln
1315 1320 1325

Pro Glu Lys Gly Gly Arg Lys Pro Ala Arg Leu Ile Val Phe Pro Asp
1330 1335 1340

Leu Gly Val Arg Val Cys Glu Lys Met Ala Leu Tyr Asp Val Val Thr
1345 1350 1355 1360

Lys Leu Pro Leu Ala Val Met Gly Ser Ser Tyr Gly Phe Gln Tyr Ser
1365 1370 1375

Pro Gly Gln Arg Val Glu Phe Leu Val Gln Ala Trp Lys Ser Lys Lys
1380 1385 1390

Thr Pro Met Gly Phe Ser Tyr Asp Thr Arg Cys Phe Asp Ser Thr Val
1395 1400 1405

Thr Glu Ser Asp Ile Arg Thr Glu Glu Ala Ile Tyr Gln Cys Cys Asp
1410 1415 1420

Leu Asp Pro Gln Ala Arg Val Ala Ile Lys Ser Leu Thr Glu Arg Leu
1425 1430 1435 1440

Tyr Val Gly Gly Pro Leu Thr Asn Ser Arg Gly Glu Asn Cys Gly Tyr
1445 1450 1455

Arg Arg Cys Arg Ala Ser Gly Val Leu Thr Thr Ser Cys Gly Asn Thr
1460 1465 1470

Leu Thr Cys Tyr Ile Lys Ala Arg Ala Ala Cys Arg Ala Ala Gly Leu
1475 1480 1485

Gln Asp Cys Thr Met Leu Val Cys Gly Asp Asp Leu Val Val Ile Cys
1490 1495 1500

Glu Ser Ala Gly Val Gln Glu Asp Ala Ala Ser Leu Arg Ala Phe Thr
1505 1510 1515 1520

Glu Ala Met Thr Arg Tyr Ser Ala Pro Pro Gly Asp Pro Pro Gln Pro
1525 1530 1535

Glu Tyr Asp Leu Glu Leu Ile Thr Ser Cys Ser Ser Asn Val Ser Val
1540 1545 1550

Ala His Asp Gly Ala Gly Lys Arg Val Tyr Tyr Leu Thr Arg Asp Pro
1555 1560 1565

Thr Thr Pro Leu Ala Arg Ala Ala Trp Glu Thr Ala Arg His Thr Pro
1570 1575 1580

Val Asn Ser Trp Leu Gly Asn Ile Ile Met Phe Ala Pro Thr Leu Trp
1585 1590 1595 1600

Ala Arg Met Ile Leu Met Thr His Phe Phe Ser Val Leu Ile Ala Arg
1605 1610 1615

Asp Gln Leu Glu Gln Ala Leu Asp Cys Glu Ile Tyr Gly Ala Cys Tyr
1620 1625 1630

Ser Ile Glu Pro Leu Asp Leu Pro Pro Ile Ile Gln Arg Leu His Gly
1635 1640 1645

Leu Ser Ala Phe Ser Leu His Ser Tyr Ser Pro Gly Glu Ile Asn Arg
1650 1655 1660

Val Ala Ala Cys Leu Arg Lys Leu Gly Val Pro Pro Leu Arg Ala Trp
1665 1670 1675 1680

Arg His Arg Ala Arg Ser Val Arg Ala Arg Leu Leu Ala Arg Gly Gly
1685 1690 1695

Arg Ala Ala Ile Cys Gly Lys Tyr Leu Phe Asn Trp Ala Val Arg Thr
1700 1705 1710

Lys Leu Lys Leu Thr Pro Ile Ala Ala Ala Gly Gln Leu Asp Leu Ser
1715 1720 1725

Gly Trp Phe Thr Ala Gly Tyr Ser Gly Gly Asp Ile Tyr His Ser Val
1730 1735 1740

Ser His Ala Arg Pro Arg Trp Ile Trp Phe Cys Leu Leu Leu Leu Ala
1745 1750 1755 1760

Ala Gly Val Gly Ile Tyr Leu Leu Pro Asn Arg Met Ser Thr Asn Pro
1765 1770 1775

Lys Pro Gln Arg Lys Thr Lys Arg Asn Thr Asn Arg Arg Pro Gln Asp
1780 1785 1790

Val Lys Phe Pro Gly Gly Gly Gln Ile Val Gly Gly Val Tyr Leu Leu
1795 1800 1805

Pro Arg Arg Gly Pro Arg Leu Gly Val Arg Ala Thr Arg Lys Thr Ser
1810 1815 1820

Glu Arg Ser Gln Pro Arg Gly Arg Arg Gln Pro Ile Pro Lys Ala Arg
1825 1830 1835 1840

Arg Pro Glu Gly Arg Thr Trp Ala Gln Pro Gly Tyr Pro Trp Pro Leu
1845 1850 1855

Tyr Gly Asn Glu Gly Cys Gly Trp Ala Gly Trp Leu Leu Ser Pro Arg
1860 1865 1870

Gly Ser Arg Pro Ser Trp Gly Pro Thr Asp Pro Arg Arg Arg Ser Arg
1875 1880 1885

Asn Leu Gly Lys
1890



<210> 14 
<211> 20316 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: 
pd.delta.NS3NS5.pj.core173

<220>
<221> CDS

<222> (12679)..(18510)
<400> 14 

atcgatccta ccccttgcgc taaagaagta tatgtgccta ctaacgcttg tctttgtctc 60
tgtcactaaa cactggatta ttactcccag atacttattt tggactaatt taaatgattt 120
cggatcaacg ttcttaatat cgctgaatct tccacaattg atgaaagtag ctaggaagag 180
gaattggtat aaagtttttg tttttgtaaa tctcgaagta tactcaaacg aatttagtat 240
tttctcagtg atctcccaga tgctttcacc ctcacttaga agtgctttaa gcattttttt 300
actgtggcta tttcccttat ctgcttcttc cgatgattcg aactgtaatt gcaaactact 360
tacaatatca gtgatatcag attgatgttt ttgtccatag taaggaataa ttgtaaattc 420
ccaagcagga atcaatttct ttaatgaggc ttccagaatt gttgcttttt gcgtcttgta 480
tttaaactgg agtgatttat tgacaatatc gaaactcagc gaattgctta tgatagtatt 540
atagctcatg aatgtggctc tcttgattgc tgttccgtta tgtgtaatca tccaacataa 600
ataggttagt tcagcagcac ataatgctat tttctcacct gaaggtcttt caaacctttc 660
cacaaactga cgaacaagca ccttaggtgg tgttttacat aatatatcaa attgtggcat 720
gcttagcgcc gatcttgtgt gcaattgata tctagtttca actactctat ttatcttgta 780
tcttgcagta ttcaaacacg ctaactcgaa aaactaactt taattgtcct gtttgtctcg 840
cgttctttcg aaaaatgcac cggccgcgca ttatttgtac tgcgaaaata attggtactg 900
cggtatcttc atttcatatt ttaaaaatgc acctttgctg cttttcctta atttttagac 960
ggcccgcagg ttcgttttgc ggtactatct tgtgataaaa agttgttttg acatgtgatc 1020
tgcacagatt ttataatgta ataagcaaga atacattatc aaacgaacaa tactggtaaa 1080
agaaaaccaa aatggacgac attgaaacag ccaagaatct gacggtaaaa gcacgtacag 1140
cttatagcgt ctgggatgta tgtcggctgt ttattgaaat gattgctcct gatgtagata 1200
ttgatataga gagtaaacgt aagtctgatg agctactctt tccaggatat gtcataaggc 1260
ccatggaatc tctcacaacc ggtaggccgt atggtcttga ttctagcgca gaagattcca 1320
gcgtatcttc tgactccagt gctgaggtaa ttttgcctgc tgcgaagatg gttaaggaaa 1380
ggtttgattc gattggaaat ggtatgctct cttcacaaga agcaagtcag gctgccatag 1440
atttgatgct acagaataac aagctgttag acaatagaaa gcaactatac aaatctattg 1500
ctataataat aggaagattg cccgagaaag acaagaagag agctaccgaa atgctcatga 1560
gaaaaatgga ttgtacacag ttattagtcc caccagctcc aacggaagaa gatgttatga 1620
agctcgtaag cgtcgttacc caattgctta ctttagttcc accagatcgt caagctgctt 1680
taataggtga tttattcatc ccggaatctc taaaggatat attcaatagt ttcaatgaac 1740
tggcggcaga gaatcgttta cagcaaaaaa agagtgagtt ggaaggaagg actgaagtga 1800
accatgctaa tacaaatgaa gaagttccct ccaggcgaac aagaagtaga gacacaaatg 1860
caagaggagc atataaatta caaaacacca tcactgaggg ccctaaagcg gttcccacga 1920
aaaaaaggag agtagcaacg agggtaaggg gcagaaaatc acgtaatact tctagggtat 1980
gatccaatat caaaggaaat gatagcattg aaggatgaga ctaatccaat tgaggagtgg 2040
cagcatatag aacagctaaa gggtagtgct gaaggaagca tacgataccc cgcatggaat 2100
gggataatat cacaggaggt actagactac ctttcatcct acataaatag acgcatataa 2160
gtacgcattt aagcataaac acgcactatg ccgttcttct catgtatata tatatacagg 2220
caacacgcag atataggtgc gacgtgaaca gtgagctgta tgtgcgcagc tcgcgttgca 2280
ttttcggaag cgctcgtttt cggaaacgct ttgaagttcc tattccgaag ttcctattct 2340
ctagaaagta taggaacttc agagcgcttt tgaaaaccaa aagcgctctg aagacgcact 2400
ttcaaaaaac caaaaacgca ccggactgta acgagctact aaaatattgc gaataccgct 2460
tccacaaaca ttgctcaaaa gtatctcttt gctatatatc tctgtgctat atccctatat 2520
aacctaccca tccacctttc gctccttgaa cttgcatcta aactcgacct ctacatcaac 2580
aggcttccaa tgctcttcaa attttactgt caagtagacc catacggctg taatatgctg 2640
ctcttcataa tgtaagctta tctttatcga atcgtgtgaa aaactactac cgcgataaac 2700
ctttacggtt ccctgagatt gaattagttc ctttagtata tgatacaaga cacttttgaa 2760
ctttgtacga cgaattttga ggttcgccat cctctggcta tttccaatta tcctgtcggc 2820
tattatctcc gcctcagttt gatcttccgc ttcagactgc catttttcac ataatgaatc 2880
tatttcaccc cacaatcctt catccgcctc cgcatcttgt tccgttaaac tattgacttc 2940
atgttgtaca ttgtttagtt cacgagaagg gtcctcttca ggcggtagct cctgatctcc 3000
tatatgacct ttatcctgtt ctctttccac aaacttagaa atgtattcat gaattatgga 3060
gcacctaata acattcttca aggcggagaa gtttgggcca gatgcccaat atgcttgaca 3120
tgaaaacgtg agaatgaatt tagtattatt gtgatattct gaggcaattt tattataatc 3180
tcgaagataa gagaagaatg cagtgacctt tgtattgaca aatggagatt ccatgtatct 3240
aaaaaatacg cctttaggcc ttctgatacc ctttcccctg cggtttagcg tgccttttac 3300
attaatatct aaaccctctc cgatggtggc ctttaactga ctaataaatg caaccgatat 3360
aaactgtgat aattctgggt gatttatgat tcgatcgaca attgtattgt acactagtgc 3420
aggatcaggc caatccagtt ctttttcaat taccggtgtg tcgtctgtat tcagtacatg 3480
tccaacaaat gcaaatgcta acgttttgta tttcttataa ttgtcaggaa ctggaaaagt 3540
cccccttgtc gtctcgatta cacacctact ttcatcgtac accataggtt ggaagtgctg 3600
cataatacat tgcttaatac aagcaagcag tctctcgcca ttcatatttc agttattttc 3660
cattacagct gatgtcattg tatatcagcg ctgtaaaaat ctatctgtta cagaaggttt 3720
tcgcggtttt tataaacaaa actttcgtta cgaaatcgag caatcacccc agctgcgtat 3780
ttggaaattc gggaaaaagt agagcaacgc gagttgcatt ttttacacca taatgcatga 3840
ttaacttcga gaagggatta aggctaattt cactagtatg tttcaaaaac ctcaatctgt 3900
ccattgaatg ccttataaaa cagctataga ttgcatagaa gagttagcta ctcaatgctt 3960
tttgtcaaag cttactgatg atgatgtgtc tactttcagg cgggtctgta gtaaggagaa 4020
tgacattata aagctggcac ttagaattcc acggactata gactatacta gtatactccg 4080
tctactgtac gatacacttc cgctcaggtc cttgtccttt aacgaggcct taccactctt 4140
ttgttactct attgatccag ctcagcaaag gcagtgtgat ctaagattct atcttcgcga 4200
tgtagtaaaa ctagctagac cgagaaagag actagaaatg caaaaggcac ttctacaatg 4260
gctgccatca ttattatccg atgtgacgct gcattttttt tttttttttt tttttttttt 4320
tttttttttt tttttttttt ttttttggta caaatatcat aaaaaaagag aatcttttta 4380
agcaaggatt ttcttaactt cttcggcgac agcatcaccg acttcggtgg tactgttgga 4440
accacctaaa tcaccagttc tgatacctgc atccaaaacc tttttaactg catcttcaat 4500
ggctttacct tcttcaggca agttcaatga caatttcaac atcattgcag cagacaagat 4560
agtggcgata gggttgacct tattctttgg caaatctgga gcggaaccat ggcatggttc 4620
gtacaaacca aatgcggtgt tcttgtctgg caaagaggcc aaggacgcag atggcaacaa 4680
acccaaggag cctgggataa cggaggcttc atcggagatg atatcaccaa acatgttgct 4740
ggtgattata ataccattta ggtgggttgg gttcttaact aggatcatgg cggcagaatc 4800
aatcaattga tgttgaactt tcaatgtagg gaattcgttc ttgatggttt cctccacagt 4860
ttttctccat aatcttgaag aggccaaaac attagcttta tccaaggacc aaataggcaa 4920
tggtggctca tgttgtaggg ccatgaaagc ggccattctt gtgattcttt gcacttctgg 4980
aacggtgtat tgttcactat cccaagcgac accatcacca tcgtcttcct ttctcttacc 5040
aaagtaaata cctcccacta attctctaac aacaacgaag tcagtacctt tagcaaattg 5100
tggcttgatt ggagataagt ctaaaagaga gtcggatgca aagttacatg gtcttaagtt 5160
ggcgtacaat tgaagttctt tacggatttt tagtaaacct tgttcaggtc taacactacc 5220
ggtaccccat ttaggaccac ccacagcacc taacaaaacg gcatcagcct tcttggaggc 5280
ttccagcgcc tcatctggaa gtggaacacc tgtagcatcg atagcagcac caccaattaa 5340
atgattttcg aaatcgaact tgacattgga acgaacatca gaaatagctt taagaacctt 5400
aatggcttcg gctgtgattt cttgaccaac gtggtcacct ggcaaaacga cgatcttctt 5460
aggggcagac attacaatgg tatatccttg aaatatatat aaaaaaaaaa aaaaaaaaaa 5520
aaaaaaaaaa atgcagcttc tcaatgatat tcgaatacgc tttgaggaga tacagcctaa 5580
tatccgacaa actgttttac agatttacga tcgtacttgt tacccatcat tgaattttga 5640
acatccgaac ctgggagttt tccctgaaac agatagtata tttgaacctg tataataata 5700
tatagtctag cgctttacgg aagacaatgt atgtatttcg gttcctggag aaactattgc 5760
atctattgca taggtaatct tgcacgtcgc atccccggtt cattttctgc gtttccatct 5820
tgcacttcaa tagcatatct ttgttaacga agcatctgtg cttcattttg tagaacaaaa 5880
atgcaacgcg agagcgctaa tttttcaaac aaagaatctg agctgcattt ttacagaaca 5940
gaaatgcaac gcgaaagcgc tattttacca acgaagaatc tgtgcttcat ttttgtaaaa 6000
caaaaatgca acgcgagagc gctaattttt caaacaaaga atctgagctg catttttaca 6060
gaacagaaat gcaacgcgag agcgctattt taccaacaaa gaatctatac ttcttttttg 6120
ttctacaaaa atgcatcccg agagcgctat ttttctaaca aagcatctta gattactttt 6180
tttctccttt gtgcgctcta taatgcagtc tcttgataac tttttgcact gtaggtccgt 6240
taaggttaga agaaggctac tttggtgtct attttctctt ccataaaaaa agcctgactc 6300
cacttcccgc gtttactgat tactagcgaa gctgcgggtg cattttttca agataaaggc 6360
atccccgatt atattctata ccgatgtgga ttgcgcatac tttgtgaaca gaaagtgata 6420
gcgttgatga ttcttcattg gtcagaaaat tatgaacggt ttcttctatt ttgtctctat 6480
atactacgta taggaaatgt ttacattttc gtattgtttt cgattcactc tatgaatagt 6540
tcttactaca atttttttgt ctaaagagta atactagaga taaacataaa aaatgtagag 6600
gtcgagttta gatgcaagtt caaggagcga aaggtggatg ggtaggttat atagggatat 6660
agcacagaga tatatagcaa agagatactt ttgagcaatg tttgtggaag cggtattcgc 6720
aatattttag tagctcgtta cagtccggtg cgtttttggt tttttgaaag tgcgtcttca 6780
gagcgctttt ggttttcaaa agcgctctga agttcctata ctttctagag aataggaact 6840
tcggaatagg aacttcaaag cgtttccgaa aacgagcgct tccgaaaatg caacgcgagc 6900
tgcgcacata cagctcactg ttcacgtcgc acctatatct gcgtgttgcc tgtatatata 6960
tatacatgag aagaacggca tagtgcgtgt ttatgcttaa atgcgtactt atatgcgtct 7020
atttatgtag gatgaaaggt agtctagtac ctcctgtgat attatcccat tccatgcggg 7080
gtatcgtatg cttccttcag cactaccctt tagctgttct atatgctgcc actcctcaat 7140
tggattagtc tcatccttca atgctatcat ttcctttgat attggatcat atgcatagta 7200
ccgagaaact agtgcgaagt agtgatcagg tattgctgtt atctgatgag tatacgttgt 7260
cctggccacg gcagaagcac gcttatcgct ccaatttccc acaacattag tcaactccgt 7320
taggcccttc attgaaagaa atgaggtcat caaatgtctt ccaatgtgag attttgggcc 7380
attttttata gcaaagattg aataaggcgc atttttcttc aaagctttat tgtacgatct 7440
gactaagtta Ccttttaata attggtattc ctgtttattg cttgaagaat tgccggtcct 7500
atttactcgt tttaggactg gttcagaatt cctcaaaaat tcatccaaat atacaagtgg 7560
atcgatgata agctgtcaaa catgagaatt cttgaagacg aaagggcctc gtgatacgcc 7620
tatttttata ggttaatgtc atgataataa tggtttctta gacgtcaggt ggcacttttc 7680
ggggaaatgt gcgcggaacc cctatttgtt tatttttcta aatacattca aatatgtatc 7740
cgctcatgag acaataaccc tgataaatgc ttcaataata ttgaaaaagg aagagtatga 7800
gtattcaaca tttccgtgtc gcccttattc ccttttttgc ggcattttgc cttcctgttt 7860
ttgctcaccc agaaacgctg gtgaaagtaa aagatgctga agatcagttg ggtgcacgag 7920
tgggttacat cgaactggat ctcaacagcg gtaagatcct tgagagtttt cgccccgaag 7980
aacgttttcc aatgatgagc acttttaaag ttctgctatg tggcgcggta ttatcccgtg 8040
ttgacgccgg gcaagagcaa ctcggtcgcc gcatacacta ttctcagaat gacttggttg 8100
agtactcacc agtcacagaa aagcatctta cggatggcat gacagtaaga gaattatgca 8160
gtgctgccat aaccatgagt gataacactg cggccaactt acttctgaca acgatcggag 8220
gaccgaagga gctaaccgct tttttgcaca acatggggga tcatgtaact cgccttgatc 8280
gttgggaacc ggagctgaat gaagccatac caaacgacga gcgtgacacc acgatgcctg 8340
cagcaatggc aacaacgttg cgcaaactat taactggcga actacttact ctagcttccc 8400
ggcaacaatt aatagactgg atggaggcgg ataaagttgc aggaccactt ctgcgctcgg 8460
cccttccggc tggctggttt attgctgata aatctggagc cggtgagcgt gggtctcgcg 8520
gtatcattgc agcactgggg ccagatggta agccctcccg tatcgtagtt atctacacga 8580
cggggagtca ggcaactatg gatgaacgaa atagacagat cgctgagata ggtgcctcac 8640
tgattaagca ttggtaactg tcagaccaag tttactcata tatactttag attgatttaa 8700
aacttcattt ttaatttaaa aggatctagg tgaagatcct ttttgataat ctcatgacca 8760
aaatccctta acgtgagttt tcgttccact gagcgtcaga ccccgtagaa aagatcaaag 8820
gatcttcttg agatcctttt tttctgcgcg taatctgctg cttgcaaaca aaaaaaccac 8880
cgctaccagc ggtggtttgt ttgccggatc aagagctacc aactcttttt ccgaaggtaa 8940
ctggcttcag cagagcgcag ataccaaata ctgtccttct agtgtagccg tagttaggcc 9000
accacttcaa gaactctgta gcaccgccta catacctcgc tctgctaatc ctgttaccag 9060
tggctgctgc cagtggcgat aagtcgtgtc ttaccgggtt ggactcaaga cgatagttac 9120
cggataaggc gcagcggtcg ggctgaacgg ggggttcgtg cacacagccc agcttggagc 9180
gaacgaccta 




a<5a ♦* ar'n't" ai^ art<^Qt"oaar»h" afrrartaaanr* 
dyduciw^uaw ciy ^y uy ovjVr U dl^uaMclcaaM^ 


gccacgcttc 


9240 


ccgaagggag 


ci CI a«j ^ \» CI w 


aoa^a ^ f faa ^aaof*oafan «nt"r»r»rraa/'a 
ciyy i-^v-yy uaetywyy^-eiy yycwyycici^a 


ggagagcgca 


9300 


cgagggagct 


fr" 31 rrrrrt rt fT a 
u C ^ cig^^vj 3 A 


Skskf^af^f*^anir af*r*t*h'^a^arT h' r«r«t*n^ f^rrrm 
ciciiM>yw^uyyu oLiVirWuciuciy wv^cuyccyyy 


tttcgccacc 


9360 


tctgacttga 


y CL \m \m w 


ui.y uyauyu ^y^^dyyyyy ycy^ciyCv- t-d 


tggaaaaacg 


9420 


ccagcaacgc 


^y Ui> u U i>Ci 


w>yyuu^^i>yy wwui* i»t>y wi.y yccbuuuycu 


cacatgttct 


9480 


ttcctgcgtt 




^^u-yvyycicci ciiMwy i»oi« i«dk« wyi»wL»i»L«ydy 


tgagctgata 


9540 


ccgctcgccg 




arT'oao^^nr'a orTiart^i^anf" rtart/^rtar^rtaa 
d^\^ydywy<Md y v^y dy L.<Mdy u ^dywydyydd 


gcggaagagc 


9600 


gcctgatgcg 




\vUWd^y^di.^ uy uywyy udu i*Uwdwdcc^c 


atatggtgca 


9660 


ctctcagtac 




SdUywwyCdVp d^uUddyCCd yUdi.dCdCl*C 


cgctatcgct 


9720 


acgtgactgg 




^n^f^^^na^s ^^^M^^aa^a ^^/^^v^^^^a^M 

cywcccydCd cccgccddCd cccyCugacg 


cgccctgacg 


9780 


ggcttgtctg 




^^n^f'^a^an a^aan/^^n^n a ^ f^rt ^ ^ ^ f^^/*v 
CCywuudCdy dCdayCUguy dCCyuCCCCg 


ggagctgcat 


9840 


gtgtcagagg ttttcaccgt catcaccgaa acgcgcgagg cagctgcggt aaagctcatc 9900
agcgtggtcg tgaagcgatt cacagatgtc tgcctgttca tccgcgtcca gctcgttgag 9960
tttctccaga agcgttaatg tctggcttct gataaagcgg gccatgttaa gggcggtttt 10020
ttcctgtttg gtcactgatg cctccgtgta agggggattt ctgttcatgg gggtaatgat 10080
accgatgaaa cgagagagga tgctcacgat acgggttact gatgatgaac atgcccggtt 10140
actggaacgt tgtgagggta aacaactggc ggtatggatg cggcgggacc agagaaaaat 10200
cactcagggt caatgccagc gcttcgttaa tacagatgta ggtgttccac agggtagcca 10260
gcagcatcct gcgatgcaga tccggaacat aatggtgcag ggcgctgact tccgcgtttc 10320
cagactttac gaaacacgga aaccgaagac cattcatgtt gttgctcagg tcgcagacgt 10380
tttgcagcag cagtcgcttc acgttcgctc gcgtatcggt gattcattct gctaaccagt 10440
aaggcaaccc cgccagccta gccgggtcct caacgacagg agcacgatca tgcgcacccg 10500
tggccaggac ccaacgctgc ccgagatgcg ccgcgtgcgg ctgctggaga tggcggacgc 10560
gatggatatg ttctgccaag ggttggtttg cgcattcaca gttctccgca agaattgatt 10620
ggctccaatt cttggagtgg tgaatccgtt agcgaggtgc cgccggcttc cattcaggtc 10680
gaggtggccc ggctccatgc accgcgacgc aacgcgggga ggcagacaag gtatagggcg 10740
gcgcctacaa tccatgccaa cccgttccat gtgctcgccg aggcggcata aatcgccgtg 10800
acgatcagcg gtccaatgat cgaagttagg ctggtaagag ccgcgagcga tccttgaagc 10860
tgtccctgat ggtcgtcatc tacctgcctg gacagcatgg cctgcaacgc gggcatcccg 10920
atgccgccgg aagcgagaag aatcataatg gggaaggcca tccagcctcg cgtcgcgaac 10980
gccagcaaga cgtagcccag cgcgtcggcc gccatgccgg cgataatggc ctgcttctcg 11040
ccgaaacgtt tggtggcggg accagtgacg aaggcttgag cgagggcgtg caagattccg 11100
aataccgcaa gcgacaggcc gatcatcgtc gcgctccagc gaaagcggtc ctcgccgaaa 11160
atgacccaga gcgctgccgg cacctgtcct acgagttgca tgataaagaa gacagtcata 11220
agtgcggcga cgatagtcat gccccgcgcc caccggaagg agctgactgg gttgaaggct 11280
ctcaagggca tcggtcgagg atccttcaat atgcgcacat acgctgttat gttcaaggtc 11340
ccttcgttta agaacgaaag cggtcttcct tttgagggat gtttcaagtt gttcaaatct 11400
atcaaatttg caaatcccca gtctgtatct agagcgttga atcggtgatg cgatttgtta 11460
attaaattga tggtgtcacc attaccaggt ctagaCatac caatggcaaa ctgagcacaa 11520
caataccagt ccggatcaac tggcaccatc tctcccgtag tctcatctaa tttttcttcc 11580
ggatgaggtt ccagatatac cgcaacacct ttattatggt ttccctgagg gaataataga 11640
atgtcccatt cgaaatcacc aattctaaac ctgggcgaat tgtatttcgg gtttgttaac 11700
tcgttccagt caggaatgtt ccacgtgaag ctatcttcca gcaaagtctc cacttcttca 11760
tcaaattgtg gagaatactc ccaatgctct tatctatggg acttccggga aacacagtac 11820
cgatacttcc caattcgtct tcagagctca ttgtttgttt gaagagacta atcaaagaat 11880
cgttttctca aaaaaattaa tatcttaact gatagtttga tcaaaggggc aaaacgtagg 11940
ggcaaacaaa cggaaaaatc gtttctcaaa ttttctgatg ccaagaactc taaccagtct 12000
tatctaaaaa ttgccttatg atccgtctct ccggttacag cctgtgtaac tgattaatcc 12060
tgcctttcta atcaccattc taatgtttta attaagggat tttgtcttca ttaacggctt 12120
tcgctcataa aaatgttatg acgttttgcc cgcaggcggg aaaccatcca cttcacgaga 12180
ctgatctcct ctgccggaac accgggcatc tccaacttat aagttggaga aataagagaa 12240
tttcagattg agagaatgaa aaaaaaaaac ccttagttca taggtccatt ctcttagcgc 12300
aactacagag aacaggggca caaacaggca aaaaacgggc acaacctcaa tggagtgatg 12360
caacctgcct ggagtaaatg atgacacaag gcaattgacc cacgcatgta tctatctcat 12420
tttcttacac cttctattac cttctgctct ctctgatttg gaaaaagctg aaaaaaaagg 12480
ttgaaaccag ttccctgaaa ttattcccct acttgactaa taagtatata aagacggtag 12540
gtattgattg taattctgta aatctatttc ttaaacttct taaattctac ttttatagtt 12600
agtctttttt ttagttttaa aacaccaaga acttagtttc gaataaacac acataaacaa 12660
acaagcttac aaaacaaa atg gct gca tat gca gct cag ggc tat aag gtg 12711
Met Ala Ala Tyr Ala Ala Gln Gly Tyr Lys Val
1 5 10

cta gta ctc aac ccc tct gtt gct gca aca ctg ggc ttt ggt gct tac 12759
Leu Val Leu Asn Pro Ser Val Ala Ala Thr Leu Gly Phe Gly Ala Tyr
15 20 25

atg tcc aag gct cat ggg atc gat cct aac atc agg acc ggg gtg aga 12807
Met Ser Lys Ala His Gly Ile Asp Pro Asn Ile Arg Thr Gly Val Arg
30 35 40

aca att acc act ggc agc ccc atc acg tac tcc acc tac ggc aag ttc 12855
Thr Ile Thr Thr Gly Ser Pro Ile Thr Tyr Ser Thr Tyr Gly Lys Phe
45 50 55

ctt gcc gac ggc ggg tgc tcg ggg ggc gct tat gac ata ata att tgt 12903
Leu Ala Asp Gly Gly Cys Ser Gly Gly Ala Tyr Asp Ile Ile Ile Cys
60 65 70 75

gac gag tgc cac tcc acg gat gcc aca tcc atc ttg ggc att ggc act 12951
Asp Glu Cys His Ser Thr Asp Ala Thr Ser Ile Leu Gly Ile Gly Thr
80 85 90

gtc ctt gac caa gca gag act gcg ggg gcg aga ctg gtt gtg ctc gcc 12999
Val Leu Asp Gln Ala Glu Thr Ala Gly Ala Arg Leu Val Val Leu Ala
95 100 105

acc gcc acc cct ccg ggc tcc gtc act gtg ccc cat ccc aac atc gag 13047
Thr Ala Thr Pro Pro Gly Ser Val Thr Val Pro His Pro Asn Ile Glu
110 115 120

gag gtt gct ctg tcc acc acc gga gag atc cct ttt tac ggc aag gct 13095
Glu Val Ala Leu Ser Thr Thr Gly Glu Ile Pro Phe Tyr Gly Lys Ala
125 130 135

atc ccc ctc gaa gta atc aag ggg ggg aga cat ctc atc ttc tgt cat 13143
Ile Pro Leu Glu Val Ile Lys Gly Gly Arg His Leu Ile Phe Cys His
140 145 150 155

tca aag aag aag tgc gac gaa ctc gcc gca aag ctg gtc gca ttg ggc 13191
Ser Lys Lys Lys Cys Asp Glu Leu Ala Ala Lys Leu Val Ala Leu Gly
160 165 170

atc aat gcc gtg gcc tac tac cgc ggt ctt gac gtg tcc gtc atc ccg 13239
Ile Asn Ala Val Ala Tyr Tyr Arg Gly Leu Asp Val Ser Val Ile Pro
175 180 185

acc agc ggc gat gtt gtc gtc gtg gca acc gat gcc ctc atg acc ggc 13287
Thr Ser Gly Asp Val Val Val Val Ala Thr Asp Ala Leu Met Thr Gly
190 195 200

tat acc ggc gac ttc gac tcg gtg ata gac tgc aat acg tgt gtc acc 13335
Tyr Thr Gly Asp Phe Asp Ser Val Ile Asp Cys Asn Thr Cys Val Thr
205 210 215

cag aca gtc gat ttc agc ctt gac cct acc ttc acc att gag aca atc 13383
Gln Thr Val Asp Phe Ser Leu Asp Pro Thr Phe Thr Ile Glu Thr Ile
220 225 230 235

acg ctc ccc caa gat gct gtc tcc cgc act caa cgt cgg ggc agg act 13431
Thr Leu Pro Gln Asp Ala Val Ser Arg Thr Gln Arg Arg Gly Arg Thr
240 245 250

ggc agg ggg aag cca ggc atc tac aga ttt gtg gca ccg ggg gag cgc 13479
Gly Arg Gly Lys Pro Gly Ile Tyr Arg Phe Val Ala Pro Gly Glu Arg
255 260 265

ccc tcc ggc atg ttc gac tcg tcc gtc ctc tgt gag tgc tat gac gca 13527
Pro Ser Gly Met Phe Asp Ser Ser Val Leu Cys Glu Cys Tyr Asp Ala
270 275 280

ggc tgt gct tgg tat gag ctc acg ccc gcc gag act aca gtt agg cta 13575
Gly Cys Ala Trp Tyr Glu Leu Thr Pro Ala Glu Thr Thr Val Arg Leu
285 290 295

cga gcg tac atg aac acc ccg ggg ctt ccc gtg tgc cag gac cat ctt 13623
Arg Ala Tyr Met Asn Thr Pro Gly Leu Pro Val Cys Gln Asp His Leu
300 305 310 315

gaa ttt tgg gag ggc gtc ttt aca ggc ctc act cat ata gat gcc cac 13671
Glu Phe Trp Glu Gly Val Phe Thr Gly Leu Thr His Ile Asp Ala His
320 325 330

ttt cta tcc cag aca aag cag agt ggg gag aac ctt cct tac ctg gta 13719
Phe Leu Ser Gln Thr Lys Gln Ser Gly Glu Asn Leu Pro Tyr Leu Val
335 340 345

gcg tac caa gcc acc gtg tgc gct agg gct caa gcc cct ccc cca tcg 13767
Ala Tyr Gln Ala Thr Val Cys Ala Arg Ala Gln Ala Pro Pro Pro Ser
350 355 360

tgg gac cag atg tgg aag tgt ttg att cgc ctc aag ccc acc ctc cat 13815
Trp Asp Gln Met Trp Lys Cys Leu Ile Arg Leu Lys Pro Thr Leu His
365 370 375

ggg cca aca ccc ctg cta tac aga ctg ggc gct gtt cag aat gaa atc 13863
Gly Pro Thr Pro Leu Leu Tyr Arg Leu Gly Ala Val Gln Asn Glu Ile
380 385 390 395

acc ctg acg cac cca gtc acc aaa tac atc atg aca tgc atg tcg gcc 13911
Thr Leu Thr His Pro Val Thr Lys Tyr Ile Met Thr Cys Met Ser Ala
400 405 410

gac ctg gag gtc gtc acg agc acc tgg gtg ctc gtt ggc ggc gtc ctg 13959
Asp Leu Glu Val Val Thr Ser Thr Trp Val Leu Val Gly Gly Val Leu
415 420 425

gct gct ttg gcc gcg tat tgc ctg tca aca ggc tgc gtg gtc ata gtg 14007
Ala Ala Leu Ala Ala Tyr Cys Leu Ser Thr Gly Cys Val Val Ile Val
430 435 440

ggc agg gtc gtc ttg tcc ggg aag ccg gca atc ata cct gac agg gaa 14055
Gly Arg Val Val Leu Ser Gly Lys Pro Ala Ile Ile Pro Asp Arg Glu
445 450 455

gtc ctc tac cga gag ttc gat gag atg gaa gag tgc tct cag cac tta 14103
Val Leu Tyr Arg Glu Phe Asp Glu Met Glu Glu Cys Ser Gln His Leu
460 465 470 475



ccg tac atc gag caa ggg atg atg ctc gcc gag cag ttc aag cag aag 14151
Pro Tyr Ile Glu Gln Gly Met Met Leu Ala Glu Gln Phe Lys Gln Lys
480 485 490

gcc ctc ggc ctc ctg cag acc gcg tcc cgt cag gca gag gtt atc gcc 14199
Ala Leu Gly Leu Leu Gln Thr Ala Ser Arg Gln Ala Glu Val Ile Ala
495 500 505

cct gct gtc cag acc aac tgg caa aaa ctc gag acc ttc tgg gcg aag 14247
Pro Ala Val Gln Thr Asn Trp Gln Lys Leu Glu Thr Phe Trp Ala Lys
510 515 520

cat atg tgg aac ttc atc agt ggg ata caa tac ttg gcg ggc ttg tca 14295
His Met Trp Asn Phe Ile Ser Gly Ile Gln Tyr Leu Ala Gly Leu Ser
525 530 535

acg ctg cct ggt aac ccc gcc att gct tca ttg atg gct ttt aca gct 14343
Thr Leu Pro Gly Asn Pro Ala Ile Ala Ser Leu Met Ala Phe Thr Ala
540 545 550 555

gct gtc acc agc cca cta acc act agc caa acc ctc ctc ttc aac ata 14391
Ala Val Thr Ser Pro Leu Thr Thr Ser Gln Thr Leu Leu Phe Asn Ile
560 565 570

ttg ggg ggg tgg gtg gct gcc cag ctc gcc gcc ccc ggt gcc gct act 14439
Leu Gly Gly Trp Val Ala Ala Gln Leu Ala Ala Pro Gly Ala Ala Thr
575 580 585

gcc ttt gtg ggc gct ggc tta gct ggc gcc gcc atc ggc agt gtt gga 14487
Ala Phe Val Gly Ala Gly Leu Ala Gly Ala Ala Ile Gly Ser Val Gly
590 595 600

ctg ggg aag gtc ctc ata gac atc ctt gca ggg tat ggc gcg ggc gtg 14535
Leu Gly Lys Val Leu Ile Asp Ile Leu Ala Gly Tyr Gly Ala Gly Val
605 610 615

gcg gga gct ctt gtg gca ttc aag atc atg agc ggt gag gtc ccc tcc 14583
Ala Gly Ala Leu Val Ala Phe Lys Ile Met Ser Gly Glu Val Pro Ser
620 625 630 635

acg gag gac ctg gtc aat cta ctg ccc gcc atc ctc tcg ccc gga gcc 14631
Thr Glu Asp Leu Val Asn Leu Leu Pro Ala Ile Leu Ser Pro Gly Ala
640 645 650

ctc gta gtc ggc gtg gtc tgt gca gca ata ctg cgc cgg cac gtt ggc 14679
Leu Val Val Gly Val Val Cys Ala Ala Ile Leu Arg Arg His Val Gly
655 660 665

ccg ggc gag ggg gca gtg cag tgg atg aac cgg ctg ata gcc ttc gcc 14727
Pro Gly Glu Gly Ala Val Gln Trp Met Asn Arg Leu Ile Ala Phe Ala
670 675 680

tcc cgg ggg aac cat gtt tcc ccc acg cac tac gtg ccg gag agc gat 14775
Ser Arg Gly Asn His Val Ser Pro Thr His Tyr Val Pro Glu Ser Asp
685 690 695

gea get gcc cgc gtc act gcc ata etc age age etc act gta ace cag 14823 
Ala Ala Ala Arg Val Thr Ala He Leu Ser Ser Leu Thr Val Thr Gin 
700 705 710 715

ctc ctg agg cga ctg cac cag tgg ata agc tcg gag tgt acc act cca 14871
Leu Leu Arg Arg Leu His Gln Trp Ile Ser Ser Glu Cys Thr Thr Pro
720 725 730

tgc tcc ggt tcc tgg cta agg gac atc tgg gac tgg ata tgc gag gtg 14919
Cys Ser Gly Ser Trp Leu Arg Asp Ile Trp Asp Trp Ile Cys Glu Val
735 740 745

ttg agc gac ttt aag acc tgg cta aaa gct aag ctc atg cca cag ctg 14967
Leu Ser Asp Phe Lys Thr Trp Leu Lys Ala Lys Leu Met Pro Gln Leu
750 755 760

cct ggg atc ccc ttt gtg tcc tgc cag cgc ggg tat aag ggg gtc tgg 15015
Pro Gly Ile Pro Phe Val Ser Cys Gln Arg Gly Tyr Lys Gly Val Trp
765 770 775

cga ggg gac ggc atc atg cac act cgc tgc cac tgt gga gct gag atc 15063
Arg Gly Asp Gly Ile Met His Thr Arg Cys His Cys Gly Ala Glu Ile
780 785 790 795

act gga cat gtc aaa aac ggg acg atg agg atc gtc ggt cct agg acc 15111
Thr Gly His Val Lys Asn Gly Thr Met Arg Ile Val Gly Pro Arg Thr
800 805 810

tgc agg aac atg tgg agt ggg acc ttc ccc att aat gcc tac acc acg 15159
Cys Arg Asn Met Trp Ser Gly Thr Phe Pro Ile Asn Ala Tyr Thr Thr
815 820 825

ggc ccc tgt acc ccc ctt cct gcg ccg aac tac acg ttc gcg cta tgg 15207
Gly Pro Cys Thr Pro Leu Pro Ala Pro Asn Tyr Thr Phe Ala Leu Trp
830 835 840

agg gtg tct gca gag gaa tac gtg gag ata agg cag gtg ggg gac ttc 15255
Arg Val Ser Ala Glu Glu Tyr Val Glu Ile Arg Gln Val Gly Asp Phe
845 850 855

cac tac gtg acg ggt atg act act gac aat ctt aaa tgc ccg tgc cag 15303
His Tyr Val Thr Gly Met Thr Thr Asp Asn Leu Lys Cys Pro Cys Gln
860 865 870 875

gtc cca tcg ccc gaa ttt ttc aca gaa ttg gac ggg gtg cgc cta cat 15351
Val Pro Ser Pro Glu Phe Phe Thr Glu Leu Asp Gly Val Arg Leu His
880 885 890

agg ttt gcg ccc ccc tgc aag ccc ttg ctg cgg gag gag gta tca ttc 15399
Arg Phe Ala Pro Pro Cys Lys Pro Leu Leu Arg Glu Glu Val Ser Phe
895 900 905

aga gta gga ctc cac gaa tac ccg gta ggg tcg caa tta cct tgc gag 15447
Arg Val Gly Leu His Glu Tyr Pro Val Gly Ser Gln Leu Pro Cys Glu
910 915 920

ccc gaa ccg gac gtg gcc gtg ttg acg tcc atg ctc act gat ccc tcc 15495
Pro Glu Pro Asp Val Ala Val Leu Thr Ser Met Leu Thr Asp Pro Ser
925 930 935

cat ata aca gca gag gcg gcc ggg cga agg ttg gcg agg gga tca ccc 15543
His Ile Thr Ala Glu Ala Ala Gly Arg Arg Leu Ala Arg Gly Ser Pro
940 945 950 955

ccc tct gtg gcc agc tcc tcg gct agc cag cta tcc gct cca tct ctc 15591
Pro Ser Val Ala Ser Ser Ser Ala Ser Gln Leu Ser Ala Pro Ser Leu
960 965 970

aag gca act tgc acc gct aac cat gac tcc cct gat gct gag ctc ata 15639
Lys Ala Thr Cys Thr Ala Asn His Asp Ser Pro Asp Ala Glu Leu Ile
975 980 985

gag gcc aac ctc cta tgg agg cag gag atg ggc ggc aac atc acc agg 15687
Glu Ala Asn Leu Leu Trp Arg Gln Glu Met Gly Gly Asn Ile Thr Arg
990 995 1000

gtt gag tca gaa aac aaa gtg gtg att ctg gac tcc ttc gat ccg ctt 15735
Val Glu Ser Glu Asn Lys Val Val Ile Leu Asp Ser Phe Asp Pro Leu
1005 1010 1015

gtg gcg gag gag gac gag cgg gag atc tcc gta ccc gca gaa atc ctg 15783
Val Ala Glu Glu Asp Glu Arg Glu Ile Ser Val Pro Ala Glu Ile Leu
1020 1025 1030 1035

cgg aag tct cgg aga ttc gcc cag gcc ctg ccc gtt tgg gcg cgg ccg 15831
Arg Lys Ser Arg Arg Phe Ala Gln Ala Leu Pro Val Trp Ala Arg Pro
1040 1045 1050

gac tat aac ccc ccg cta gtg gag acg tgg aaa aag ccc gac tac gaa 15879
Asp Tyr Asn Pro Pro Leu Val Glu Thr Trp Lys Lys Pro Asp Tyr Glu
1055 1060 1065

cca cct gtg gtc cat ggc tgc ccg ctt cca cct cca aag tcc cct cct 15927
Pro Pro Val Val His Gly Cys Pro Leu Pro Pro Pro Lys Ser Pro Pro
1070 1075 1080

gtg cct ccg cct cgg aag aag cgg acg gtg gtc ctc act gaa tca acc 15975
Val Pro Pro Pro Arg Lys Lys Arg Thr Val Val Leu Thr Glu Ser Thr
1085 1090 1095

cta tct act gcc ttg gcc gag ctc gcc acc aga agc ttt ggc agc tcc 16023
Leu Ser Thr Ala Leu Ala Glu Leu Ala Thr Arg Ser Phe Gly Ser Ser
1100 1105 1110 1115

tca act tcc ggc att acg ggc gac aat acg aca aca tcc tct gag ccc 16071
Ser Thr Ser Gly Ile Thr Gly Asp Asn Thr Thr Thr Ser Ser Glu Pro
1120 1125 1130

gcc cct tct ggc tgc ccc ccc gac tcc gac gct gag tcc tat tcc tcc 16119
Ala Pro Ser Gly Cys Pro Pro Asp Ser Asp Ala Glu Ser Tyr Ser Ser
1135 1140 1145

atg ccc ccc ctg gag ggg gag cct ggg gat ccg gat ctt agc gac ggg 16167
Met Pro Pro Leu Glu Gly Glu Pro Gly Asp Pro Asp Leu Ser Asp Gly
1150 1155 1160

tca tgg tca acg gtc agt agt gag gcc aac gcg gag gat gtc gtg tgc 16215
Ser Trp Ser Thr Val Ser Ser Glu Ala Asn Ala Glu Asp Val Val Cys
1165 1170 1175

tgc tca atg tct tac tct tgg aca ggc gca ctc gtc acc ccg tgc gcc 16263
Cys Ser Met Ser Tyr Ser Trp Thr Gly Ala Leu Val Thr Pro Cys Ala
1180 1185 1190 1195

gcg gaa gaa cag aaa ctg ccc atc aat gca cta agc aac tcg ttg cta 16311
Ala Glu Glu Gln Lys Leu Pro Ile Asn Ala Leu Ser Asn Ser Leu Leu
1200 1205 1210

cgt cac cac aat ttg gtg tat tcc acc acc tca cgc agt gct tgc caa 16359
Arg His His Asn Leu Val Tyr Ser Thr Thr Ser Arg Ser Ala Cys Gln
1215 1220 1225

agg cag aag aaa gtc aca ttt gac aga ctg caa gtt ctg gac agc cat 16407
Arg Gln Lys Lys Val Thr Phe Asp Arg Leu Gln Val Leu Asp Ser His
1230 1235 1240

tac cag gac gta ctc aag gag gtt aaa gca gcg gcg tca aaa gtg aag 16455
Tyr Gln Asp Val Leu Lys Glu Val Lys Ala Ala Ala Ser Lys Val Lys
1245 1250 1255

gct aac ttg cta tcc gta gag gaa gct tgc agc ctg acg ccc cca cac 16503
Ala Asn Leu Leu Ser Val Glu Glu Ala Cys Ser Leu Thr Pro Pro His
1260 1265 1270 1275

tca gcc aaa tcc aag ttt ggt tat ggg gca aaa gac gtc cgt tgc cat 16551
Ser Ala Lys Ser Lys Phe Gly Tyr Gly Ala Lys Asp Val Arg Cys His
1280 1285 1290

gcc aga aag gcc gta acc cac atc aac tcc gtg tgg aaa gac ctt ctg 16599
Ala Arg Lys Ala Val Thr His Ile Asn Ser Val Trp Lys Asp Leu Leu
1295 1300 1305

gaa gac aat gta aca cca ata gac act acc atc atg gct aag aac gag 16647
Glu Asp Asn Val Thr Pro Ile Asp Thr Thr Ile Met Ala Lys Asn Glu
1310 1315 1320

gtt ttc tgc gtt cag cct gag aag ggg ggt cgt aag cca gct cgt ctc 16695
Val Phe Cys Val Gln Pro Glu Lys Gly Gly Arg Lys Pro Ala Arg Leu
1325 1330 1335

atc gtg ttc ccc gat ctg ggc gtg cgc gtg tgc gaa aag atg gct ttg 16743
Ile Val Phe Pro Asp Leu Gly Val Arg Val Cys Glu Lys Met Ala Leu
1340 1345 1350 1355

tac gac gtg gtt aca aag ctc ccc ttg gcc gtg atg gga agc tcc tac 16791
Tyr Asp Val Val Thr Lys Leu Pro Leu Ala Val Met Gly Ser Ser Tyr
1360 1365 1370

gga ttc caa tac tca cca gga cag cgg gtt gaa ttc ctc gtg caa gcg 16839
Gly Phe Gln Tyr Ser Pro Gly Gln Arg Val Glu Phe Leu Val Gln Ala
1375 1380 1385

tgg aag tcc aag aaa acc cca atg ggg ttc tcg tat gat acc cgc tgc 16887
Trp Lys Ser Lys Lys Thr Pro Met Gly Phe Ser Tyr Asp Thr Arg Cys
1390 1395 1400

ttt gac tcc aca gtc act gag agc gac atc cgt acg gag gag gca atc 16935
Phe Asp Ser Thr Val Thr Glu Ser Asp Ile Arg Thr Glu Glu Ala Ile
1405 1410 1415

tac caa tgt tgt gac ctc gac ccc caa gcc cgc gtg gcc atc aag tcc 16983
Tyr Gln Cys Cys Asp Leu Asp Pro Gln Ala Arg Val Ala Ile Lys Ser
1420 1425 1430 1435

ctc acc gag agg ctt tat gtt ggg ggc cct ctt acc aat tca agg ggg 17031
Leu Thr Glu Arg Leu Tyr Val Gly Gly Pro Leu Thr Asn Ser Arg Gly
1440 1445 1450

gag aac tgc ggc tat cgc agg tgc cgc gcg agc ggc gta ctg aca act 17079
Glu Asn Cys Gly Tyr Arg Arg Cys Arg Ala Ser Gly Val Leu Thr Thr
1455 1460 1465

agc tgt ggt aac acc ctc act tgc tac atc aag gcc cgg gca gcc tgt 17127
Ser Cys Gly Asn Thr Leu Thr Cys Tyr Ile Lys Ala Arg Ala Ala Cys
1470 1475 1480

cga gcc gca ggg ctc cag gac tgc acc atg ctc gtg tgt ggc gac gac 17175
Arg Ala Ala Gly Leu Gln Asp Cys Thr Met Leu Val Cys Gly Asp Asp
1485 1490 1495

tta gtc gtt atc tgt gaa agc gcg ggg gtc cag gag gac gcg gcg agc 17223
Leu Val Val Ile Cys Glu Ser Ala Gly Val Gln Glu Asp Ala Ala Ser
1500 1505 1510 1515

ctg aga gcc ttc acg gag gct atg acc agg tac tcc gcc ccc cct ggg 17271
Leu Arg Ala Phe Thr Glu Ala Met Thr Arg Tyr Ser Ala Pro Pro Gly
1520 1525 1530

gac ccc cca caa cca gaa tac gac ttg gag ctc ata aca tca tgc tcc 17319
Asp Pro Pro Gln Pro Glu Tyr Asp Leu Glu Leu Ile Thr Ser Cys Ser
1535 1540 1545

tcc aac gtg tca gtc gcc cac gac ggc gct gga aag agg gtc tac tac 17367
Ser Asn Val Ser Val Ala His Asp Gly Ala Gly Lys Arg Val Tyr Tyr
1550 1555 1560

ctc acc cgt gac cct aca acc ccc ctc gcg aga gct gcg tgg gag aca 17415
Leu Thr Arg Asp Pro Thr Thr Pro Leu Ala Arg Ala Ala Trp Glu Thr
1565 1570 1575

gca aga cac act cca gtc aat tcc tgg cta ggc aac ata atc atg ttt 17463
Ala Arg His Thr Pro Val Asn Ser Trp Leu Gly Asn Ile Ile Met Phe
1580 1585 1590 1595

gcc ccc aca ctg tgg gcg agg atg ata ctg atg acc cat ttc ttt agc 17511
Ala Pro Thr Leu Trp Ala Arg Met Ile Leu Met Thr His Phe Phe Ser
1600 1605 1610

gtc ctt ata gcc agg gac cag ctt gaa cag gcc ctc gat tgc gag atc 17559
Val Leu Ile Ala Arg Asp Gln Leu Glu Gln Ala Leu Asp Cys Glu Ile
1615 1620 1625

tac ggg gcc tgc tac tcc ata gaa cca ctg gat cta cct cca atc att 17607
Tyr Gly Ala Cys Tyr Ser Ile Glu Pro Leu Asp Leu Pro Pro Ile Ile
1630 1635 1640

caa aga ctc cat ggc ctc agc gca ttt tca ctc cac agt tac tct cca 17655
Gln Arg Leu His Gly Leu Ser Ala Phe Ser Leu His Ser Tyr Ser Pro
1645 1650 1655

ggt gaa atc aat agg gtg gcc gca tgc ctc aga aaa ctt ggg gta ccg 17703
Gly Glu Ile Asn Arg Val Ala Ala Cys Leu Arg Lys Leu Gly Val Pro
1660 1665 1670 1675

ccc ttg cga gct tgg aga cac cgg gcc cgg agc gtc cgc gct agg ctt 17751
Pro Leu Arg Ala Trp Arg His Arg Ala Arg Ser Val Arg Ala Arg Leu
1680 1685 1690

ctg gcc aga gga ggc agg gct gcc ata tgt ggc aag tac ctc ttc aac 17799
Leu Ala Arg Gly Gly Arg Ala Ala Ile Cys Gly Lys Tyr Leu Phe Asn
1695 1700 1705

tgg gca gta aga aca aag ctc aaa ctc act cca ata gcg gcc gct ggc 17847
Trp Ala Val Arg Thr Lys Leu Lys Leu Thr Pro Ile Ala Ala Ala Gly
1710 1715 1720

cag ctg gac ttg tcc ggc tgg ttc acg gct ggc tac agc ggg gga gac 17895
Gln Leu Asp Leu Ser Gly Trp Phe Thr Ala Gly Tyr Ser Gly Gly Asp
1725 1730 1735

att tat cac agc gtg tct cat gcc cgg ccc cgc tgg atc tgg ttt tgc 17943
Ile Tyr His Ser Val Ser His Ala Arg Pro Arg Trp Ile Trp Phe Cys
1740 1745 1750 1755

cta ctc ctg ctt gct gca ggg gta ggc atc tac ctc ctc ccc aac cga 17991
Leu Leu Leu Leu Ala Ala Gly Val Gly Ile Tyr Leu Leu Pro Asn Arg
1760 1765 1770

atg agc acg aat cct aaa cct caa aga aag acc aaa cgt aac acc aac 18039
Met Ser Thr Asn Pro Lys Pro Gln Arg Lys Thr Lys Arg Asn Thr Asn
1775 1780 1785

cgg cgg ccg cag gac gtc aag ttc ccg ggt ggc ggt cag atc gtt ggt 18087
Arg Arg Pro Gln Asp Val Lys Phe Pro Gly Gly Gly Gln Ile Val Gly
1790 1795 1800

gga gtt tac ttg ttg ccg cgc agg ggc cct aga ttg ggt gtg cgc gcg 18135
Gly Val Tyr Leu Leu Pro Arg Arg Gly Pro Arg Leu Gly Val Arg Ala
1805 1810 1815

acg aga aag act tcc gag cgg tcg caa cct cga ggt aga cgt cag cct 18183
Thr Arg Lys Thr Ser Glu Arg Ser Gln Pro Arg Gly Arg Arg Gln Pro
1820 1825 1830 1835

atc ccc aag gct cgt cgg ccc gag ggc agg acc tgg gct cag ccc ggg 18231
Ile Pro Lys Ala Arg Arg Pro Glu Gly Arg Thr Trp Ala Gln Pro Gly
1840 1845 1850

tac cct tgg ccc ctc tat ggc aat gag ggc tgc ggg tgg gcg gga tgg 18279
Tyr Pro Trp Pro Leu Tyr Gly Asn Glu Gly Cys Gly Trp Ala Gly Trp
1855 1860 1865

ctc ctg tct ccc cgt ggc tct cgg cct agc tgg ggc ccc aca gac ccc 18327
Leu Leu Ser Pro Arg Gly Ser Arg Pro Ser Trp Gly Pro Thr Asp Pro
1870 1875 1880

cgg cgt agg tcg cgc aat ttg ggt aag gtc atc gat acc ctt acg tgc 18375
Arg Arg Arg Ser Arg Asn Leu Gly Lys Val Ile Asp Thr Leu Thr Cys
1885 1890 1895

ggc ttc gcc gac ctc atg ggg tac ata ccg ctc gtc ggc gcc cct ctt 18423
Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Leu Val Gly Ala Pro Leu
1900 1905 1910 1915

gga ggc gct gcc agg gcc ctg gcg cat ggc gtc cgg gtt ctg gaa gac 18471
Gly Gly Ala Ala Arg Ala Leu Ala His Gly Val Arg Val Leu Glu Asp 
1920 1925 1930 

ggc gtg aac tat gca aca ggg aac ctt cct ggt tgc tct taatagtcga 18520 
Gly Val Asn Tyr Ala Thr Gly Asn Leu Pro Gly Cys Ser 
1935 1940 

ctttgttccc actgtacttt tagctcgtac aaaatacaat atacttttca tttctccgta 18580 

aacaacatgt tttcccatgt aatatccttt tctatttttc gttccgttac caactttaca 18640 

catactttat atagctattc acttctatac actaaaaaac taagacaatt ttaattttgc 18700 

tgcctgccat atttcaattt gttataaatt cctataattt atcctattag tagctaaaaa 18760 

aagatgaatg tgaatcgaat cctaagagaa ttggatctga tccacaggac gggtgtggtc 18820 

gccatgatcg cgtagtcgat agtggctcca agtagcgaag cgagcaggac tgggcggcgg 18880 

ccaaagcggt cggacagtgc tccgagaacg ggtgcgcata gaaattgcat caacgcatat 18940 

agcgctagca gcacgccata gtgactggcg atgctgtcgg aatggacgat atcccgcaag 19000 

aggcccggca gtaccggcat aaccaagcct atgcctacag catccagggt gacggtgccg 19060 

aggatgacga tgagcgcatt gttagatttc atacacggtg cctgactgcg ttagcaattt 19120 

aactgtgata aactaccgca ttaaagcttt ttctttccaa tttttttttt ttcgtcatta 19180 

taaaaatcat tacgaccgag attcccgggt aataactgat ataattaaat tgaagctcta 19240 

atttgtgagt ttagtataca tgcatttact tataatacag ttttttagtt ttgctggccg 19300 

catcttctca aatatgcttc ccagcctgct tttctgtaac gttcaccctc taccttagca 19360 

tcccttccct ttgcaaatag tcctcttcca acaataataa tgtcagatcc tgtagagacc 19420 

acatcatcca cggttctata ctgttgaccc aatgcgtctc ccttgtcatc taaacccaca 19480 

ccgggtgtca taatcaacca atcgtaacct tcatctcttc cacccatgtc tctttgagca 19540

ataaagccga taacaaaatc tttgtcgctc ttcgcaatgt caacagtacc cttagtatat 19600 

tctccagtag atagggagcc cttgcatgac aattctgcta acatcaaaag gcctctaggt 19660 

tcctttgtta cttcttctgc cgcctgcttc aaaccgctaa caatacctgg gcccaccaca 19720 

ccgtgtgcat tcgtaatgtc tgcccattct gctattctgt atacacccgc agagtactgc 19780 

aatttgactg tattaccaat gtcagcaaat tttctgtctt cgaagagtaa aaaattgtac 19840 

ttggcggata atgcctttag cggcttaact gtgccctcca tggaaaaatc agtcaagata 19900 

tccacatgtg tttttagtaa acaaattttg ggacctaatg cttcaactaa ctccagtaat 19960 

tccttggtgg tacgaacatc caatgaagca cacaagtttg tttgcttttc gtgcatgata 20020 

ttaaatagct tggcagcaac aggactagga tgagtagcag cacgttcctt atatgtagct 20080 
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ttcgacatga tttatcttcg tttcctgcag gtttttgttc tgtgcagttg ggttaagaat 20140 

actgggcaat ttcatgtttc ttcaacacta catatgcgta tatataccaa tctaagtctg 20200 

tgctccttcc ttcgttcttc cttctgttcg gagattaccg aatcaaaaaa atttcaagga 20260 

aaccgaaatc aaaaaaaaga ataaaaaaaa aatgatgaat tgaaaagctt atcgat 20316 



<210> 15 
<211> 1944 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: 
pd.delta.NS3NS5.pj.core173

<400> 15 

Met Ala Ala Tyr Ala Ala Gln Gly Tyr Lys Val Leu Val Leu Asn Pro
1 5 10 15

Ser Val Ala Ala Thr Leu Gly Phe Gly Ala Tyr Met Ser Lys Ala His
20 25 30

Gly Ile Asp Pro Asn Ile Arg Thr Gly Val Arg Thr Ile Thr Thr Gly
35 40 45

Ser Pro Ile Thr Tyr Ser Thr Tyr Gly Lys Phe Leu Ala Asp Gly Gly
50 55 60

Cys Ser Gly Gly Ala Tyr Asp Ile Ile Ile Cys Asp Glu Cys His Ser
65 70 75 80

Thr Asp Ala Thr Ser Ile Leu Gly Ile Gly Thr Val Leu Asp Gln Ala
85 90 95

Glu Thr Ala Gly Ala Arg Leu Val Val Leu Ala Thr Ala Thr Pro Pro
100 105 110

Gly Ser Val Thr Val Pro His Pro Asn Ile Glu Glu Val Ala Leu Ser
115 120 125

Thr Thr Gly Glu Ile Pro Phe Tyr Gly Lys Ala Ile Pro Leu Glu Val
130 135 140

Ile Lys Gly Gly Arg His Leu Ile Phe Cys His Ser Lys Lys Lys Cys
145 150 155 160

Asp Glu Leu Ala Ala Lys Leu Val Ala Leu Gly Ile Asn Ala Val Ala
165 170 175

Tyr Tyr Arg Gly Leu Asp Val Ser Val Ile Pro Thr Ser Gly Asp Val
180 185 190

Val Val Val Ala Thr Asp Ala Leu Met Thr Gly Tyr Thr Gly Asp Phe
195 200 205
Asp Ser Val Ile Asp Cys Asn Thr Cys Val Thr Gln Thr Val Asp Phe
210 215 220

Ser Leu Asp Pro Thr Phe Thr Ile Glu Thr Ile Thr Leu Pro Gln Asp
225 230 235 240

Ala Val Ser Arg Thr Gln Arg Arg Gly Arg Thr Gly Arg Gly Lys Pro
245 250 255

Gly Ile Tyr Arg Phe Val Ala Pro Gly Glu Arg Pro Ser Gly Met Phe
260 265 270

Asp Ser Ser Val Leu Cys Glu Cys Tyr Asp Ala Gly Cys Ala Trp Tyr
275 280 285

Glu Leu Thr Pro Ala Glu Thr Thr Val Arg Leu Arg Ala Tyr Met Asn
290 295 300

Thr Pro Gly Leu Pro Val Cys Gln Asp His Leu Glu Phe Trp Glu Gly
305 310 315 320

Val Phe Thr Gly Leu Thr His Ile Asp Ala His Phe Leu Ser Gln Thr
325 330 335

Lys Gln Ser Gly Glu Asn Leu Pro Tyr Leu Val Ala Tyr Gln Ala Thr
340 345 350

Val Cys Ala Arg Ala Gln Ala Pro Pro Pro Ser Trp Asp Gln Met Trp
355 360 365

Lys Cys Leu Ile Arg Leu Lys Pro Thr Leu His Gly Pro Thr Pro Leu
370 375 380

Leu Tyr Arg Leu Gly Ala Val Gln Asn Glu Ile Thr Leu Thr His Pro
385 390 395 400

Val Thr Lys Tyr Ile Met Thr Cys Met Ser Ala Asp Leu Glu Val Val
405 410 415

Thr Ser Thr Trp Val Leu Val Gly Gly Val Leu Ala Ala Leu Ala Ala
420 425 430

Tyr Cys Leu Ser Thr Gly Cys Val Val Ile Val Gly Arg Val Val Leu
435 440 445

Ser Gly Lys Pro Ala Ile Ile Pro Asp Arg Glu Val Leu Tyr Arg Glu
450 455 460

Phe Asp Glu Met Glu Glu Cys Ser Gln His Leu Pro Tyr Ile Glu Gln
465 470 475 480

Gly Met Met Leu Ala Glu Gln Phe Lys Gln Lys Ala Leu Gly Leu Leu
485 490 495

Gln Thr Ala Ser Arg Gln Ala Glu Val Ile Ala Pro Ala Val Gln Thr
500 505 510

Asn Trp Gln Lys Leu Glu Thr Phe Trp Ala Lys His Met Trp Asn Phe
515 520 525

Ile Ser Gly Ile Gln Tyr Leu Ala Gly Leu Ser Thr Leu Pro Gly Asn
530 535 540

Pro Ala Ile Ala Ser Leu Met Ala Phe Thr Ala Ala Val Thr Ser Pro
545 550 555 560

Leu Thr Thr Ser Gln Thr Leu Leu Phe Asn Ile Leu Gly Gly Trp Val
565 570 575

Ala Ala Gln Leu Ala Ala Pro Gly Ala Ala Thr Ala Phe Val Gly Ala
580 585 590

Gly Leu Ala Gly Ala Ala Ile Gly Ser Val Gly Leu Gly Lys Val Leu
595 600 605

Ile Asp Ile Leu Ala Gly Tyr Gly Ala Gly Val Ala Gly Ala Leu Val
610 615 620

Ala Phe Lys Ile Met Ser Gly Glu Val Pro Ser Thr Glu Asp Leu Val
625 630 635 640

Asn Leu Leu Pro Ala Ile Leu Ser Pro Gly Ala Leu Val Val Gly Val
645 650 655

Val Cys Ala Ala Ile Leu Arg Arg His Val Gly Pro Gly Glu Gly Ala
660 665 670

Val Gln Trp Met Asn Arg Leu Ile Ala Phe Ala Ser Arg Gly Asn His
675 680 685

Val Ser Pro Thr His Tyr Val Pro Glu Ser Asp Ala Ala Ala Arg Val
690 695 700

Thr Ala Ile Leu Ser Ser Leu Thr Val Thr Gln Leu Leu Arg Arg Leu
705 710 715 720

His Gln Trp Ile Ser Ser Glu Cys Thr Thr Pro Cys Ser Gly Ser Trp
725 730 735

Leu Arg Asp Ile Trp Asp Trp Ile Cys Glu Val Leu Ser Asp Phe Lys
740 745 750

Thr Trp Leu Lys Ala Lys Leu Met Pro Gln Leu Pro Gly Ile Pro Phe
755 760 765

Val Ser Cys Gln Arg Gly Tyr Lys Gly Val Trp Arg Gly Asp Gly Ile
770 775 780

Met His Thr Arg Cys His Cys Gly Ala Glu Ile Thr Gly His Val Lys
785 790 795 800

Asn Gly Thr Met Arg Ile Val Gly Pro Arg Thr Cys Arg Asn Met Trp
805 810 815

Ser Gly Thr Phe Pro Ile Asn Ala Tyr Thr Thr Gly Pro Cys Thr Pro
820 825 830

Leu Pro Ala Pro Asn Tyr Thr Phe Ala Leu Trp Arg Val Ser Ala Glu
835 840 845

Glu Tyr Val Glu Ile Arg Gln Val Gly Asp Phe His Tyr Val Thr Gly
850 855 860

Met Thr Thr Asp Asn Leu Lys Cys Pro Cys Gln Val Pro Ser Pro Glu
865 870 875 880

Phe Phe Thr Glu Leu Asp Gly Val Arg Leu His Arg Phe Ala Pro Pro
885 890 895

Cys Lys Pro Leu Leu Arg Glu Glu Val Ser Phe Arg Val Gly Leu His
900 905 910

Glu Tyr Pro Val Gly Ser Gln Leu Pro Cys Glu Pro Glu Pro Asp Val
915 920 925

Ala Val Leu Thr Ser Met Leu Thr Asp Pro Ser His Ile Thr Ala Glu
930 935 940

Ala Ala Gly Arg Arg Leu Ala Arg Gly Ser Pro Pro Ser Val Ala Ser
945 950 955 960

Ser Ser Ala Ser Gln Leu Ser Ala Pro Ser Leu Lys Ala Thr Cys Thr
965 970 975

Ala Asn His Asp Ser Pro Asp Ala Glu Leu Ile Glu Ala Asn Leu Leu
980 985 990

Trp Arg Gln Glu Met Gly Gly Asn Ile Thr Arg Val Glu Ser Glu Asn
995 1000 1005

Lys Val Val Ile Leu Asp Ser Phe Asp Pro Leu Val Ala Glu Glu Asp
1010 1015 1020

Glu Arg Glu Ile Ser Val Pro Ala Glu Ile Leu Arg Lys Ser Arg Arg
1025 1030 1035 1040

Phe Ala Gln Ala Leu Pro Val Trp Ala Arg Pro Asp Tyr Asn Pro Pro
1045 1050 1055

Leu Val Glu Thr Trp Lys Lys Pro Asp Tyr Glu Pro Pro Val Val His
1060 1065 1070

Gly Cys Pro Leu Pro Pro Pro Lys Ser Pro Pro Val Pro Pro Pro Arg
1075 1080 1085

Lys Lys Arg Thr Val Val Leu Thr Glu Ser Thr Leu Ser Thr Ala Leu
1090 1095 1100

Ala Glu Leu Ala Thr Arg Ser Phe Gly Ser Ser Ser Thr Ser Gly Ile
1105 1110 1115 1120

Thr Gly Asp Asn Thr Thr Thr Ser Ser Glu Pro Ala Pro Ser Gly Cys
1125 1130 1135

Pro Pro Asp Ser Asp Ala Glu Ser Tyr Ser Ser Met Pro Pro Leu Glu
1140 1145 1150

Gly Glu Pro Gly Asp Pro Asp Leu Ser Asp Gly Ser Trp Ser Thr Val
1155 1160 1165

Ser Ser Glu Ala Asn Ala Glu Asp Val Val Cys Cys Ser Met Ser Tyr
1170 1175 1180

Ser Trp Thr Gly Ala Leu Val Thr Pro Cys Ala Ala Glu Glu Gln Lys
1185 1190 1195 1200

Leu Pro Ile Asn Ala Leu Ser Asn Ser Leu Leu Arg His His Asn Leu
1205 1210 1215

Val Tyr Ser Thr Thr Ser Arg Ser Ala Cys Gln Arg Gln Lys Lys Val
1220 1225 1230

Thr Phe Asp Arg Leu Gln Val Leu Asp Ser His Tyr Gln Asp Val Leu
1235 1240 1245

Lys Glu Val Lys Ala Ala Ala Ser Lys Val Lys Ala Asn Leu Leu Ser
1250 1255 1260

Val Glu Glu Ala Cys Ser Leu Thr Pro Pro His Ser Ala Lys Ser Lys
1265 1270 1275 1280

Phe Gly Tyr Gly Ala Lys Asp Val Arg Cys His Ala Arg Lys Ala Val
1285 1290 1295

Thr His Ile Asn Ser Val Trp Lys Asp Leu Leu Glu Asp Asn Val Thr
1300 1305 1310

Pro Ile Asp Thr Thr Ile Met Ala Lys Asn Glu Val Phe Cys Val Gln
1315 1320 1325

Pro Glu Lys Gly Gly Arg Lys Pro Ala Arg Leu Ile Val Phe Pro Asp
1330 1335 1340

Leu Gly Val Arg Val Cys Glu Lys Met Ala Leu Tyr Asp Val Val Thr
1345 1350 1355 1360

Lys Leu Pro Leu Ala Val Met Gly Ser Ser Tyr Gly Phe Gln Tyr Ser
1365 1370 1375

Pro Gly Gln Arg Val Glu Phe Leu Val Gln Ala Trp Lys Ser Lys Lys
1380 1385 1390

Thr Pro Met Gly Phe Ser Tyr Asp Thr Arg Cys Phe Asp Ser Thr Val
1395 1400 1405

Thr Glu Ser Asp Ile Arg Thr Glu Glu Ala Ile Tyr Gln Cys Cys Asp
1410 1415 1420

Leu Asp Pro Gln Ala Arg Val Ala Ile Lys Ser Leu Thr Glu Arg Leu
1425 1430 1435 1440

Tyr Val Gly Gly Pro Leu Thr Asn Ser Arg Gly Glu Asn Cys Gly Tyr
1445 1450 1455

Arg Arg Cys Arg Ala Ser Gly Val Leu Thr Thr Ser Cys Gly Asn Thr
1460 1465 1470

Leu Thr Cys Tyr Ile Lys Ala Arg Ala Ala Cys Arg Ala Ala Gly Leu
1475 1480 1485

Gln Asp Cys Thr Met Leu Val Cys Gly Asp Asp Leu Val Val Ile Cys
1490 1495 1500

Glu Ser Ala Gly Val Gln Glu Asp Ala Ala Ser Leu Arg Ala Phe Thr
1505 1510 1515 1520

Glu Ala Met Thr Arg Tyr Ser Ala Pro Pro Gly Asp Pro Pro Gln Pro
1525 1530 1535

Glu Tyr Asp Leu Glu Leu Ile Thr Ser Cys Ser Ser Asn Val Ser Val
1540 1545 1550

Ala His Asp Gly Ala Gly Lys Arg Val Tyr Tyr Leu Thr Arg Asp Pro
1555 1560 1565

Thr Thr Pro Leu Ala Arg Ala Ala Trp Glu Thr Ala Arg His Thr Pro
1570 1575 1580

Val Asn Ser Trp Leu Gly Asn Ile Ile Met Phe Ala Pro Thr Leu Trp
1585 1590 1595 1600

Ala Arg Met Ile Leu Met Thr His Phe Phe Ser Val Leu Ile Ala Arg
1605 1610 1615

Asp Gln Leu Glu Gln Ala Leu Asp Cys Glu Ile Tyr Gly Ala Cys Tyr
1620 1625 1630

Ser Ile Glu Pro Leu Asp Leu Pro Pro Ile Ile Gln Arg Leu His Gly
1635 1640 1645

Leu Ser Ala Phe Ser Leu His Ser Tyr Ser Pro Gly Glu Ile Asn Arg
1650 1655 1660

Val Ala Ala Cys Leu Arg Lys Leu Gly Val Pro Pro Leu Arg Ala Trp
1665 1670 1675 1680

Arg His Arg Ala Arg Ser Val Arg Ala Arg Leu Leu Ala Arg Gly Gly
1685 1690 1695

Arg Ala Ala Ile Cys Gly Lys Tyr Leu Phe Asn Trp Ala Val Arg Thr
1700 1705 1710

Lys Leu Lys Leu Thr Pro Ile Ala Ala Ala Gly Gln Leu Asp Leu Ser
1715 1720 1725

Gly Trp Phe Thr Ala Gly Tyr Ser Gly Gly Asp Ile Tyr His Ser Val
1730 1735 1740

Ser His Ala Arg Pro Arg Trp Ile Trp Phe Cys Leu Leu Leu Leu Ala
1745 1750 1755 1760

Ala Gly Val Gly Ile Tyr Leu Leu Pro Asn Arg Met Ser Thr Asn Pro
1765 1770 1775

Lys Pro Gln Arg Lys Thr Lys Arg Asn Thr Asn Arg Arg Pro Gln Asp
1780 1785 1790

Val Lys Phe Pro Gly Gly Gly Gln Ile Val Gly Gly Val Tyr Leu Leu
1795 1800 1805

Pro Arg Arg Gly Pro Arg Leu Gly Val Arg Ala Thr Arg Lys Thr Ser
1810 1815 1820

Glu Arg Ser Gln Pro Arg Gly Arg Arg Gln Pro Ile Pro Lys Ala Arg
1825 1830 1835 1840

Arg Pro Glu Gly Arg Thr Trp Ala Gln Pro Gly Tyr Pro Trp Pro Leu
1845 1850 1855

Tyr Gly Asn Glu Gly Cys Gly Trp Ala Gly Trp Leu Leu Ser Pro Arg
1860 1865 1870

Gly Ser Arg Pro Ser Trp Gly Pro Thr Asp Pro Arg Arg Arg Ser Arg
1875 1880 1885

Asn Leu Gly Lys Val Ile Asp Thr Leu Thr Cys Gly Phe Ala Asp Leu
1890 1895 1900

Met Gly Tyr Ile Pro Leu Val Gly Ala Pro Leu Gly Gly Ala Ala Arg
1905 1910 1915 1920

Ala Leu Ala His Gly Val Arg Val Leu Glu Asp Gly Val Asn Tyr Ala
1925 1930 1935

Thr Gly Asn Leu Pro Gly Cys Ser
1940



<210> 16 
<211> 20217 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: 
pd.delta.NS3NS5.pj.core140

<220> 
<221> CDS 

<222> (12679)..(18411)
<400> 16 



atcgatccta 


ccccttgcgc 


taaagaagta 


tatgtgccta 


ctaacgcttg 


tctttgtctc 


60 


tgtcactaaa 


cactggatta 


ttactcccag 


atacttattt 


tggactaatt 


taaatgattt 


120 


cggatcaacg 


ttcttaatat 


cgctgaatct 


tccacaattg 


atgaaagtag ctaggaagag 


180 


gaattggtat 


aaagtttttg 


tttttgtaaa 


tctcgaagta 


tactcaaacg 


aatttagtat 


240 


tttctcagtg 


atctcccaga 


tgctttcacc 


ctcacttaga 


agtgctttaa gcattttttt 


300 


actgtggcta 


tttcccttat 


ctgcttcttc 


cgatgattcg 


aactgtaatt 


gcaaactact 


360 


tacaatatca 


gtgatatcag 


attgatgttt 


ttgtccatag 


taaggaataa 


ttgtaaattc 


420 


ccaagcagga 


atcaatttct 


ttaatgaggc 


ttccagaatt 


gttgcttttt 


gcgtcttgta 


480 


tttaaactgg 


agtgatttat 


tgacaatatc 


gaaactcagc 
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atagctcatg aatgtggctc tcttgattgc 
ataggttagt tcagcagcac ataatgctat 
cacaaactga cgaacaagca ccttaggtgg 
gcttagcgcc gatcttgtgt gcaattgata 
tcttgcagta ttcaaacacg ctaactcgaa 
cgttctttcg aaaaatgcac cggccgcgca 
cggtatcttc atttcatatt ttaaaaatgc 
ggcccgcagg ttcgttttgc ggtactatct 
tgcacagatt ttataatgta ataagcaaga 
agaaaaccaa aatggacgac attgaaacag 
cttatagcgt ctgggatgta tgtcggctgt 
ttgatataga gagtaaacgt aagtctgatg 
ccatggaatc tctcacaacc ggtaggccgt 
gcgtatcttc tgactccagt gctgaggtaa 
ggtttgattc gattggaaat ggtatgctct 
atttgatgct acagaataac aagctgttag 
ctataataat aggaagattg cccgagaaag 
gaaaaatgga ttgtacacag ttattagtcc 
agctcgtaag cgtcgttacc caattgctta 
taataggtga tttattcatc ccggaatctc 
tggcggcaga gaatcgttta cagcaaaaaa 
accatgctaa tacaaatgaa gaagttccct 
caagaggagc atataaatta caaaacacca 
aaaaaaggag agtagcaacg agggtaaggg 
gatccaatat caaaggaaat gatagcattg 
cagcatatag aacagctaaa gggtagtgct 
gggataatat cacaggaggt actagactac 
gtacgcattt aagcataaac acgcactatg 
caacacgcag atataggtgc gacgtgaaca 
ttttcggaag cgctcgtttt cggaaacgct 



tgttccgtta tgtgtaatca tccaacataa 600 
tttctcacct gaaggtcttt caaacctttc 660 
tgttttacat aatatatcaa attgtggcat 720 
tctagtttca actactctat ttatcttgta 780 
aaactaactt taattgtcct gtttgtctcg 840 
ttatttgtac tgcgaaaata attggtactg 900 
acctttgctg cttttcctta atttttagac 960 
tgtgataaaa agttgttttg acatgtgatc 1020 
atacattatc aaacgaacaa tactggtaaa 1080 
ccaagaatct gacggtaaaa gcacgtacag 1140 
ttattgaaat gattgctcct gatgtagata 1200 
agctactctt tccaggatat gtcataaggc 1260 
atggtcttga ttctagcgca gaagattcca 1320 
ttttgcctgc tgcgaagatg gttaaggaaa 1380 
cttcacaaga agcaagtcag gctgccatag 1440 
acaatagaaa gcaactatac aaatctattg 1500 
acaagaagag agctaccgaa atgctcatga 1560 
caccagctcc aacggaagaa gatgttatga 1620 
ctttagttcc accagatcgt caagctgctt 1680 
taaaggatat attcaatagt ttcaatgaac 1740 
agagtgagtt ggaaggaagg actgaagtga 1800 
ccaggcgaac aagaagtaga gacacaaatg 1860 
tcactgaggg ccctaaagcg gttcccacga 1920 
gcagaaaatc acgtaatact tctagggtat 1980 
aaggatgaga ctaatccaat tgaggagtgg 2040 
gaaggaagca tacgataccc cgcatggaat 2100 
ctttcatcct acataaatag acgcatataa 2160 
ccgttcttct catgtatata tatatacagg 2220 
gtgagctgta tgtgcgcagc tcgcgttgca 2280 
ttgaagttcc tattccgaag ttcctattct 2340 
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ctagaaagta 


taggaacttc 


agagcgcttt 


tgaaaaccaa 


aagcgctctg 


aagacgcact 


2400 


ttcaaaaaac 


caaaaacgca 


ccggactgta 


acgagctact 


aaaatattgc 


gaataccgct 


2460 


tccacaaaca 


ttgctcaaaa 


gtatctcttt 


gctatatatc 


tctgtgctat 


atccctatat 


2520 


aacctaccca 


tccacctttc 


gctccttgaa 


cttgcatcta 


aactcgacct 


ctacatcaac 


2580 


aggcttccaa 


tgctcttcaa 


attttactgt 


caagtagacc 


catacggctg 


taatatgctg 


2640 


ctcttcataa 


tgtaagctta 


tctttatcga 


atcgtgtgaa 


aaactactac 


cgcgataaac 


2700 


ctttacggtt ccctgagatt gaattagttc ctttagtata tgatacaaga cacttttgaa 2760
ctttgtacga cgaattttga ggttcgccat cctctggcta tttccaatta tcctgtcggc 2820
tattatctcc gcctcagttt gatcttccgc ttcagactgc catttttcac ataatgaatc 2880
tatttcaccc cacaatcctt catccgcctc cgcatcttgt tccgttaaac tattgacttc 2940
atgttgtaca ttgtttagtt cacgagaagg gtcctcttca ggcggtagct cctgatctcc 3000
tatatgacct ttatcctgtt ctctttccac aaacttagaa atgtattcat gaattatgga 3060
gcacctaata acattcttca aggcggagaa gtttgggcca gatgcccaat atgcttgaca 3120
tgaaaacgtg agaatgaatt tagtattatt gtgatattct gaggcaattt tattataatc 3180
tcgaagataa gagaagaatg cagtgacctt tgtattgaca aatggagatt ccatgtatct 3240
aaaaaatacg cctttaggcc ttctgatacc ctttcccctg cggtttagcg tgccttttac 3300
attaatatct aaaccctctc cgatggtggc ctttaactga ctaataaatg caaccgatat 3360
aaactgtgat aattctgggt gatttatgat tcgatcgaca attgtattgt acactagtgc 3420
aggatcaggc caatccagtt ctttttcaat taccggtgtg tcgtctgtat tcagtacatg 3480
tccaacaaat gcaaatgcta acgttttgta tttcttataa ttgtcaggaa ctggaaaagt 3540
cccccttgtc gtctcgatta cacacctact ttcatcgtac accataggtt ggaagtgctg 3600
cataatacat tgcttaatac aagcaagcag tctctcgcca ttcatatttc agttattttc 3660
cattacagct gatgtcattg tatatcagcg ctgtaaaaat ctatctgtta cagaaggttt 3720
tcgcggtttt tataaacaaa actttcgtta cgaaatcgag caatcacccc agctgcgtat 3780
ttggaaattc gggaaaaagt agagcaacgc gagttgcatt ttttacacca taatgcatga 3840
ttaacttcga gaagggatta aggctaattt cactagtatg tttcaaaaac ctcaatctgt 3900
ccattgaatg ccttataaaa cagctataga ttgcatagaa gagttagcta ctcaatgctt 3960
tttgtcaaag cttactgatg atgatgtgtc tactttcagg cgggtctgta gtaaggagaa 4020
tgacattata aagctggcac ttagaattcc acggactata gactatacta gtatactccg 4080
tctactgtac gatacacttc cgctcaggtc cttgtccttt aacgaggcct taccactctt 4140
ttgttactct attgatccag ctcagcaaag gcagtgtgat ctaagattct atcttcgcga 4200
tgtagtaaaa ctagctagac cgagaaagag actagaaatg caaaaggcac ttctacaatg 4260
gctgccatca ttattatccg atgtgacgct gcattttttt tttttttttt tttttttttt 4320
tttttttttt tttttttttt ttttttggta caaatatcat aaaaaaagag aatcttttta 4380
agcaaggatt ttcttaactt cttcggcgac agcatcaccg acttcggtgg tactgttgga 4440
accacctaaa tcaccagttc tgatacctgc atccaaaacc tttttaactg catcttcaat 4500
ggctttacct tcttcaggca agttcaatga caatttcaac atcattgcag cagacaagat 4560
agtggcgata gggttgacct tattctttgg caaatctgga gcggaaccat ggcatggttc 4620
gtacaaacca aatgcggtgt tcttgtctgg caaagaggcc aaggacgcag atggcaacaa 4680
acccaaggag cctgggataa cggaggcttc atcggagatg atatcaccaa acatgttgct 4740
ggtgattata ataccattta ggtgggttgg gttcttaact aggatcatgg cggcagaatc 4800
aatcaattga tgttgaactt tcaatgtagg gaattcgttc ttgatggttt cctccacagt 4860
ttttctccat aatcttgaag aggccaaaac attagcttta tccaaggacc aaataggcaa 4920
^ggtggctca tgttgtaggg ccatgaaagc ggccattctt gtgattcttt gcacttctgg 4980
aacggtgtat tgttcactat cccaagcgac accatcacca tcgtcttcct ttctcttacc 5040
aaagtaaata cctcccacta attctctaac aacaacgaag tcagtacctt tagcaaattg 5100
tggcttgatt ggagataagt ctaaaagaga gtcggatgca aagttacatg gtcttaagtt 5160
ggcgtacaat tgaagttctt tacggatttt tagtaaacct tgttcaggtc taacactacc 5220
ggtaccccat ttaggaccac ccacagcacc taacaaaacg gcatcagcct tcttggaggc 5280
ttccagcgcc tcatctggaa gtggaacacc tgtagcatcg atagcagcac caccaattaa 5340
atgattttcg aaatcgaact tgacattgga acgaacatca gaaatagctt taagaacctt 5400
aatggcttcg gctgtgattt cttgaccaac gtggtcacct ggcaaaacga cgatcttctt 5460
aggggcagac attacaatgg tatatccttg aaatatatat aaaaaaaaaa aaaaaaaaaa 5520
aaaaaaaaaa atgcagcttc tcaatgatat tcgaatacgc tttgaggaga tacagcctaa 5580
tatccgacaa actgttttac agatttacga tcgtacttgt tacccatcat tgaattttga 5640
acatccgaac ctgggagttt tccctgaaac agatagtata tttgaacctg tataataata 5700
tatagtctag cgctttacgg aagacaatgt atgtatttcg gttcctggag aaactattgc 5760
atctattgca taggtaatct tgcacgtcgc atccccggtt cattttctgc gtttccatct 5820
tgcacttcaa tagcatatct ttgttaacga agcatctgtg cttcattttg tagaacaaaa 5880
atgcaacgcg agagcgctaa tttttcaaac aaagaatctg agctgcattt ttacagaaca 5940
gaaatgcaac gcgaaagcgc tattttacca acgaagaatc tgtgcttcat ttttgtaaaa 6000
caaaaatgca acgcgagagc gctaattttt caaacaaaga atctgagctg catttttaca 6060 
gaacagaaat gcaacgcgag agcgctattt taccaacaaa gaatctatac ttcttttttg 6120 
ttctacaaaa atgcatcccg agagcgctat ttttctaaca aagcatctta gattactttt 6180 
tttctccttt gtgcgctcta taatgcagtc tcttgataac tttttgcact gtaggtccgt 6240 
taaggttaga agaaggctac tttggtgtct attttctctt ccataaaaaa agcctgactc 6300 
cacttcccgc gtttactgat tactagcgaa gctgcgggtg cattttttca agataaaggc 6360 
atccccgatt atattctata ccgatgtgga ttgcgcatac tttgtgaaca gaaagtgata 6420 
gcgttgatga ttcttcattg gtcagaaaat tatgaacggt ttcttctatt ttgtctctat 6480 
atactacgta taggaaatgt ttacattttc gtattgtttt cgattcactc tatgaatagt 6540 
tcttactaca atttttttgt ctaaagagta atactagaga taaacataaa aaatgtagag 6600 
gtcgagttta gatgcaagtt caaggagcga aaggtggatg ggtaggttat atagggatat 6660 
agcacagaga tatatagcaa agagatactt ttgagcaatg tttgtggaag cggtattcgc 6720 
aatattttag tagctcgtta cagtccggtg cgtttttggt tttttgaaag tgcgtcttca 6780 
gagcgctttt ggttttcaaa agcgctctga agttcctata ctttctagag aataggaact 6840
tcggaatagg aacttcaaag cgtttccgaa aacgagcgct tccgaaaatg caacgcgagc 6900 
tgcgcacata cagctcactg ttcacgtcgc acctatatct gcgtgttgcc tgtatatata 6960 
tatacatgag aagaacggca tagtgcgtgt ttatgcttaa atgcgtactt atatgcgtct 7020 
acttatgtag gatgaaaggt agtctagtac ctcctgtgat attatcccat tccatgcggg 7080
gtatcgtatg cttccttcag cactaccctt tagctgttct atatgctgcc actcctcaat 7140 
tggattagtc tcatccttca atgctatcat ttcctttgat attggatcat atgcatagta 7200 
ccgagaaact agtgcgaagt agtgatcagg tattgctgtt atctgatgag tatacgttgt 7260 
cctggccacg gcagaagcac gcttatcgct ccaatttccc acaacattag tcaactccgt 7320 
taggcccttc attgaaagaa atgaggtcat caaatgtctt ccaatgtgag attttgggcc 7380 
attttttata gcaaagattg aataaggcgc atttttcttc aaagctttat tgtacgatct 7440 
gactaagtta tcttttaata attggtattc ctgtttattg cttgaagaat tgccggtcct 7500 
atttactcgt tttaggactg gttcagaatt cctcaaaaat tcatccaaat atacaagtgg 7560 
atcgatgata agctgtcaaa catgagaatt cttgaagacg aaagggcctc gtgatacgcc 7620 
tatttttata ggttaatgtc atgataataa tggtttctta gacgtcaggt ggcacttttc 7680 
ggggaaatgt gcgcggaacc cctatttgtt tatttttcta aatacattca aatatgtatc 7740 
cgctcatgag acaataaccc tgataaatgc ttcaataata ttgaaaaagg aagagtatga 7800
gtattcaaca tttccgtgtc gcccttattc ccttttttgc ggcattttgc cttcctgttt 7860
ttgctcaccc agaaacgctg gtgaaagtaa aagatgctga agatcagttg ggtgcacgag 7920
tgggttacat cgaactggat ctcaacagcg gtaagatcct tgagagtttt cgccccgaag 7980
aacgttttcc aatgatgagc acttttaaag ttctgctatg tggcgcggta ttatcccgtg 8040
ttgacgccgg gcaagagcaa ctcggtcgcc gcatacacta ttctcagaat gacttggttg 8100
agtactcacc agtcacagaa aagcatctta cggatggcat gacagtaaga gaattatgca 8160
gtgctgccat aaccatgagt gataacactg cggccaactt acttctgaca acgatcggag 8220
gaccgaagga gctaaccgct tttttgcaca acatggggga tcatgtaact cgccttgatc 8280
gttgggaacc ggagctgaat gaagccatac caaacgacga gcgtgacacc acgatgcctg 8340
cagcaatggc aacaacgttg cgcaaactat taactggcga actacttact ctagcttccc 8400
ggcaacaatt aatagactgg atggaggcgg ataaagttgc aggaccactt ctgcgctcgg 8460
cccttccggc tggctggttt attgctgata aatctggagc cggtgagcgt gggtctcgcg 8520
gtatcattgc agcactgggg ccagatggta agccctcccg tatcgtagtt atctacacga 8580
cggggagtca ggcaactatg gatgaacgaa atagacagat cgctgagata ggtgcctcac 8640
tgattaagca ttggtaactg tcagaccaag tttactcata tatactttag attgatttaa 8700
aacttcattt ttaatttaaa aggatctagg tgaagatcct ttttgataat ctcatgacca 8760
aaatccctta acgtgagttt tcgttccact gagcgtcaga ccccgtagaa aagatcaaag 8820
gatcttcttg agatcctttt tttctgcgcg taatctgctg cttgcaaaca aaaaaaccac 8880
cgctaccagc ggtggtttgt ttgccggatc aagagctacc aactcttttt ccgaaggtaa 8940
ctggcttcag cagagcgcag ataccaaata ctgtccttct agtgtagccg tagttaggcc 9000
accacttcaa gaactctgta gcaccgccta catacctcgc tctgctaatc ctgttaccag 9060
tggctgctgc cagtggcgat aagtcgtgtc ttaccgggtt ggactcaaga cgatagttac 9120
cggataaggc gcagcggtcg ggctgaacgg ggggttcgtg cacacagccc agcttggagc 9180
gaacgaccta caccgaactg agatacctac agcgtgagct atgagaaagc gccacgcttc 9240
ccgaagggag aaaggcggac aggtatccgg taagcggcag ggtcggaaca ggagagcgca 9300
cgagggagct tccaggggga aacgcctggt atctttatag tcctgtcggg tttcgccacc 9360
tctgacttga gcgtcgattt ttgtgatgct cgtcaggggg gcggagccta tggaaaaacg 9420
ccagcaacgc ggccttttta cggttcctgg ccttttgctg gccttttgct cacatgttct 9480
ttcctgcgtt atcccctgat tctgtggata accgtattac cgcctttgag tgagctgata 9540
ccgctcgccg cagccgaacg accgagcgca gcgagtcagt gagcgaggaa gcggaagagc 9600
gcctgatgcg gtattttctc cttacgcatc tgtgcggtat ttcacaccgc atatggtgca 9660
ctctcagtac aatctgctct gatgccgcat agttaagcca gtatacactc cgctatcgct 9720
acgtgactgg gtcatggctg cgccccgaca cccgccaaca cccgctgacg cgccctgacg 9780
ggcttgtctg ctcccggcat ccgcttacag acaagctgtg accgtctccg ggagctgcat 9840
gtgtcagagg ttttcaccgt catcaccgaa acgcgcgagg cagctgcggt aaagctcatc 9900
agcgtggtcg tgaagcgatt cacagatgtc tgcctgttca tccgcgtcca gctcgttgag 9960
tttctccaga agcgttaatg tctggcttct gataaagcgg gccatgttaa gggcggtttt 10020
ttcctgtttg gtcactgatg cctccgtgta agggggattt ctgttcatgg gggtaatgat 10080
accgatgaaa cgagagagga tgctcacgat acgggttact gatgatgaac atgcccggtt 10140
actggaacgt tgtgagggta aacaactggc ggtatggatg cggcgggacc agagaaaaat 10200
cactcagggt caatgccagc gcttcgttaa tacagatgta ggtgttccac agggtagcca 10260
gcagcatcct gcgatgcaga tccggaacat aatggtgcag ggcgctgact tccgcgtttc 10320
cagactttac gaaacacgga aaccgaagac cattcatgtt gttgctcagg tcgcagacgt 10380
tttgcagcag cagtcgcttc acgttcgctc gcgtatcggt gattcattct gctaaccagt 10440
aaggcaaccc cgccagccta gccgggtcct caacgacagg agcacgatca tgcgcacccg 10500
tggccaggac ccaacgctgc ccgagatgcg ccgcgtgcgg ctgctggaga tggcggacgc 10560
gatggatatg ttctgccaag ggttggtttg cgcattcaca gttctccgca agaattgatt 10620
ggctccaatt cttggagtgg tgaatccgtt agcgaggtgc cgccggcttc cattcaggtc 10680
gaggtggccc ggctccatgc accgcgacgc aacgcgggga ggcagacaag gtatagggcg 10740
gcgcctacaa tccatgccaa cccgttccat gtgctcgccg aggcggcata aatcgccgtg 10800
acgatcagcg gtccaatgat cgaagttagg ctggtaagag ccgcgagcga tccttgaagc 10860
tgtccctgat ggtcgtcatc tacctgcctg gacagcatgg cctgcaacgc gggcatcccg 10920
atgccgccgg aagcgagaag aatcataatg gggaaggcca tccagcctcg cgtcgcgaac 10980
gccagcaaga cgtagcccag cgcgtcggcc gccatgccgg cgataatggc ctgcttctcg 11040
ccgaaacgtt tggtggcggg accagtgacg aaggcttgag cgagggcgtg caagattccg 11100
aataccgcaa gcgacaggcc gatcatcgtc gcgctccagc gaaagcggtc ctcgccgaaa 11160
atgacccaga gcgctgccgg cacctgtcct acgagttgca tgataaagaa gacagtcata 11220
agtgcggcga cgatagtcat gccccgcgcc caccggaagg agctgactgg gttgaaggct 11280
ctcaagggca tcggtcgagg atccttcaat atgcgcacat acgctgttat gttcaaggtc 11340
ccttcgttta agaacgaaag cggtcttcct tttgagggat gtttcaagtt gttcaaatct 11400
atcaaatttg caaatcccca gtctgtatct agagcgttga atcggtgatg cgatttgtta 11460
attaaattga tggtgtcacc attaccaggt ctagatatac caatggcaaa ctgagcacaa 11520
caataccagt ccggatcaac tggcaccatc tctcccgtag tctcatctaa tttttcttcc 11580
ggatgaggtt ccagatatac cgcaacacct ttattatggt ttccctgagg gaataataga 11640
atgtcccatt cgaaatcacc aattctaaac ctgggcgaat tgtatttcgg gtttgttaac 11700
tcgttccagt caggaatgtt ccacgtgaag ctatcttcca gcaaagtctc cacttcttca 11760
tcaaattgtg gagaatactc ccaatgctct tatctatggg acttccggga aacacagtac 11820
cgatacttcc caattcgtct tcagagctca ttgtttgttt gaagagacta atcaaagaat 11880
cgttttctca aaaaaattaa tatcttaact gatagtttga tcaaaggggc aaaacgtagg 11940
ggcaaacaaa cggaaaaatc gtttctcaaa ttttctgatg ccaagaactc taaccagtct 12000
tatctaaaaa ttgccttatg atccgtctct ccggttacag cctgtgtaac tgattaatcc 12060
tgcctttcta atcaccattc taatgtttta attaagggat tttgtcttca ttaacggctt 12120
tcgctcataa aaatgttatg acgttttgcc cgcaggcggg aaaccatcca cttcacgaga 12180
ctgatctcct ctgccggaac accgggcatc tccaacttat aagttggaga aataagagaa 12240
tttcagattg agagaatgaa aaaaaaaaac ccttagttca taggtccatt ctcttagcgc 12300
aactacagag aacaggggca caaacaggca aaaaacgggc acaacctcaa tggagtgatg 12360
caacctgcct ggagtaaatg atgacacaag gcaattgacc cacgcatgta tctatctcat 12420
tttcttacac cttctattac cttctgctct ctctgatttg gaaaaagctg aaaaaaaagg 12480
ttgaaaccag ttccctgaaa ttattcccct acttgactaa taagtatata aagacggtag 12540
gtattgattg taattctgta aatctatttc ttaaacttct taaattctac ttttatagtt 12600
agtctttttt ttagttttaa aacaccaaga acttagtttc gaataaacac acataaacaa 12660


acaagcttac aaaacaaa atg gct gca tat gca gct cag ggc tat aag gtg 12711
Met Ala Ala Tyr Ala Ala Gln Gly Tyr Lys Val
1 5 10



cta gta ctc aac ccc tct gtt gct gca aca ctg ggc ttt ggt gct tac 12759
Leu Val Leu Asn Pro Ser Val Ala Ala Thr Leu Gly Phe Gly Ala Tyr
15 20 25

atg tcc aag gct cat ggg atc gat cct aac atc agg acc ggg gtg aga 12807
Met Ser Lys Ala His Gly Ile Asp Pro Asn Ile Arg Thr Gly Val Arg
30 35 40

aca att acc act ggc agc ccc atc acg tac tcc acc tac ggc aag ttc 12855
Thr Ile Thr Thr Gly Ser Pro Ile Thr Tyr Ser Thr Tyr Gly Lys Phe
45 50 55
ctt gcc gac ggc ggg tgc tcg ggg ggc gct tat gac ata ata att tgt 12903
Leu Ala Asp Gly Gly Cys Ser Gly Gly Ala Tyr Asp Ile Ile Ile Cys
60 65 70 75

gac gag tgc cac tcc acg gat gcc aca tcc atc ttg ggc att ggc act 12951
Asp Glu Cys His Ser Thr Asp Ala Thr Ser Ile Leu Gly Ile Gly Thr
80 85 90

gtc ctt gac caa gca gag act gcg ggg gcg aga ctg gtt gtg ctc gcc 12999
Val Leu Asp Gln Ala Glu Thr Ala Gly Ala Arg Leu Val Val Leu Ala
95 100 105

acc gcc acc cct ccg ggc tcc gtc act gtg ccc cat ccc aac atc gag 13047
Thr Ala Thr Pro Pro Gly Ser Val Thr Val Pro His Pro Asn Ile Glu
110 115 120

gag gtt gct ctg tcc acc acc gga gag atc cct ttt tac ggc aag gct 13095
Glu Val Ala Leu Ser Thr Thr Gly Glu Ile Pro Phe Tyr Gly Lys Ala
125 130 135

atc ccc ctc gaa gta atc aag ggg ggg aga cat ctc atc ttc tgt cat 13143
Ile Pro Leu Glu Val Ile Lys Gly Gly Arg His Leu Ile Phe Cys His
140 145 150 155

tca aag aag aag tgc gac gaa ctc gcc gca aag ctg gtc gca ttg ggc 13191
Ser Lys Lys Lys Cys Asp Glu Leu Ala Ala Lys Leu Val Ala Leu Gly
160 165 170

atc aat gcc gtg gcc tac tac cgc ggt ctt gac gtg tcc gtc atc ccg 13239
Ile Asn Ala Val Ala Tyr Tyr Arg Gly Leu Asp Val Ser Val Ile Pro
175 180 185

acc agc ggc gat gtt gtc gtc gtg gca acc gat gcc ctc atg acc ggc 13287
Thr Ser Gly Asp Val Val Val Val Ala Thr Asp Ala Leu Met Thr Gly
190 195 200

tat acc ggc gac ttc gac tcg gtg ata gac tgc aat acg tgt gtc acc 13335
Tyr Thr Gly Asp Phe Asp Ser Val Ile Asp Cys Asn Thr Cys Val Thr
205 210 215

cag aca gtc gat ttc agc ctt gac cct acc ttc acc att gag aca atc 13383
Gln Thr Val Asp Phe Ser Leu Asp Pro Thr Phe Thr Ile Glu Thr Ile
220 225 230 235

acg ctc ccc caa gat gct gtc tcc cgc act caa cgt cgg ggc agg act 13431
Thr Leu Pro Gln Asp Ala Val Ser Arg Thr Gln Arg Arg Gly Arg Thr
240 245 250

ggc agg ggg aag cca ggc atc tac aga ttt gtg gca ccg ggg gag cgc 13479
Gly Arg Gly Lys Pro Gly Ile Tyr Arg Phe Val Ala Pro Gly Glu Arg
255 260 265

ccc tcc ggc atg ttc gac tcg tcc gtc ctc tgt gag tgc tat gac gca 13527
Pro Ser Gly Met Phe Asp Ser Ser Val Leu Cys Glu Cys Tyr Asp Ala
270 275 280

ggc tgt gct tgg tat gag ctc acg ccc gcc gag act aca gtt agg cta 13575
Gly Cys Ala Trp Tyr Glu Leu Thr Pro Ala Glu Thr Thr Val Arg Leu
285 290 295
cga gcg tac atg aac acc ccg ggg ctt ccc gtg tgc cag gac cat ctt 13623
Arg Ala Tyr Met Asn Thr Pro Gly Leu Pro Val Cys Gln Asp His Leu
300 305 310 315

gaa ttt tgg gag ggc gtc ttt aca ggc ctc act cat ata gat gcc cac 13671
Glu Phe Trp Glu Gly Val Phe Thr Gly Leu Thr His Ile Asp Ala His
320 325 330

ttt cta tcc cag aca aag cag agt ggg gag aac ctt cct tac ctg gta 13719
Phe Leu Ser Gln Thr Lys Gln Ser Gly Glu Asn Leu Pro Tyr Leu Val
335 340 345

gcg tac caa gcc acc gtg tgc gct agg gct caa gcc cct ccc cca tcg 13767
Ala Tyr Gln Ala Thr Val Cys Ala Arg Ala Gln Ala Pro Pro Pro Ser
350 355 360

tgg gac cag atg tgg aag tgt ttg att cgc ctc aag ccc acc ctc cat 13815
Trp Asp Gln Met Trp Lys Cys Leu Ile Arg Leu Lys Pro Thr Leu His
365 370 375

ggg cca aca ccc ctg cta tac aga ctg ggc gct gtt cag aat gaa atc 13863
Gly Pro Thr Pro Leu Leu Tyr Arg Leu Gly Ala Val Gln Asn Glu Ile
380 385 390 395

acc ctg acg cac cca gtc acc aaa tac atc atg aca tgc atg tcg gcc 13911
Thr Leu Thr His Pro Val Thr Lys Tyr Ile Met Thr Cys Met Ser Ala
400 405 410

gac ctg gag gtc gtc acg agc acc tgg gtg ctc gtt ggc ggc gtc ctg 13959
Asp Leu Glu Val Val Thr Ser Thr Trp Val Leu Val Gly Gly Val Leu
415 420 425

gct gct ttg gcc gcg tat tgc ctg tca aca ggc tgc gtg gtc ata gtg 14007
Ala Ala Leu Ala Ala Tyr Cys Leu Ser Thr Gly Cys Val Val Ile Val
430 435 440

ggc agg gtc gtc ttg tcc ggg aag ccg gca atc ata cct gac agg gaa 14055
Gly Arg Val Val Leu Ser Gly Lys Pro Ala Ile Ile Pro Asp Arg Glu
445 450 455

gtc ctc tac cga gag ttc gat gag atg gaa gag tgc tct cag cac tta 14103
Val Leu Tyr Arg Glu Phe Asp Glu Met Glu Glu Cys Ser Gln His Leu
460 465 470 475

ccg tac atc gag caa ggg atg atg ctc gcc gag cag ttc aag cag aag 14151
Pro Tyr Ile Glu Gln Gly Met Met Leu Ala Glu Gln Phe Lys Gln Lys
480 485 490

gcc ctc ggc ctc ctg cag acc gcg tcc cgt cag gca gag gtt atc gcc 14199
Ala Leu Gly Leu Leu Gln Thr Ala Ser Arg Gln Ala Glu Val Ile Ala
495 500 505

cct gct gtc cag acc aac tgg caa aaa ctc gag acc ttc tgg gcg aag 14247
Pro Ala Val Gln Thr Asn Trp Gln Lys Leu Glu Thr Phe Trp Ala Lys
510 515 520

cat atg tgg aac ttc atc agt ggg ata caa tac ttg gcg ggc ttg tca 14295
His Met Trp Asn Phe Ile Ser Gly Ile Gln Tyr Leu Ala Gly Leu Ser
525 530 535
acg ctg cct ggt aac ccc gcc att gct tca ttg atg gct ttt aca gct 14343
Thr Leu Pro Gly Asn Pro Ala Ile Ala Ser Leu Met Ala Phe Thr Ala
540 545 550 555

gct gtc acc agc cca cta acc act agc caa acc ctc ctc ttc aac ata 14391
Ala Val Thr Ser Pro Leu Thr Thr Ser Gln Thr Leu Leu Phe Asn Ile
560 565 570

ttg ggg ggg tgg gtg gct gcc cag ctc gcc gcc ccc ggt gcc gct act 14439
Leu Gly Gly Trp Val Ala Ala Gln Leu Ala Ala Pro Gly Ala Ala Thr
575 580 585

gcc ttt gtg ggc gct ggc tta gct ggc gcc gcc atc ggc agt gtt gga 14487
Ala Phe Val Gly Ala Gly Leu Ala Gly Ala Ala Ile Gly Ser Val Gly
590 595 600

ctg ggg aag gtc ctc ata gac atc ctt gca ggg tat ggc gcg ggc gtg 14535
Leu Gly Lys Val Leu Ile Asp Ile Leu Ala Gly Tyr Gly Ala Gly Val
605 610 615

gcg gga gct ctt gtg gca ttc aag atc atg agc ggt gag gtc ccc tcc 14583
Ala Gly Ala Leu Val Ala Phe Lys Ile Met Ser Gly Glu Val Pro Ser
620 625 630 635

acg gag gac ctg gtc aat cta ctg ccc gcc atc ctc tcg ccc gga gcc 14631
Thr Glu Asp Leu Val Asn Leu Leu Pro Ala Ile Leu Ser Pro Gly Ala
640 645 650

ctc gta gtc ggc gtg gtc tgt gca gca ata ctg cgc cgg cac gtt ggc 14679
Leu Val Val Gly Val Val Cys Ala Ala Ile Leu Arg Arg His Val Gly
655 660 665

ccg ggc gag ggg gca gtg cag tgg atg aac cgg ctg ata gcc ttc gcc 14727
Pro Gly Glu Gly Ala Val Gln Trp Met Asn Arg Leu Ile Ala Phe Ala
670 675 680

tcc cgg ggg aac cat gtt tcc ccc acg cac tac gtg ccg gag agc gat 14775
Ser Arg Gly Asn His Val Ser Pro Thr His Tyr Val Pro Glu Ser Asp
685 690 695

gca gct gcc cgc gtc act gcc ata ctc agc agc ctc act gta acc cag 14823
Ala Ala Ala Arg Val Thr Ala Ile Leu Ser Ser Leu Thr Val Thr Gln
700 705 710 715

ctc ctg agg cga ctg cac cag tgg ata agc tcg gag tgt acc act cca 14871
Leu Leu Arg Arg Leu His Gln Trp Ile Ser Ser Glu Cys Thr Thr Pro
720 725 730

tgc tcc ggt tcc tgg cta agg gac atc tgg gac tgg ata tgc gag gtg 14919
Cys Ser Gly Ser Trp Leu Arg Asp Ile Trp Asp Trp Ile Cys Glu Val
735 740 745

ttg agc gac ttt aag acc tgg cta aaa gct aag ctc atg cca cag ctg 14967
Leu Ser Asp Phe Lys Thr Trp Leu Lys Ala Lys Leu Met Pro Gln Leu
750 755 760

cct ggg atc ccc ttt gtg tcc tgc cag cgc ggg tat aag ggg gtc tgg 15015
Pro Gly Ile Pro Phe Val Ser Cys Gln Arg Gly Tyr Lys Gly Val Trp
765 770 775
cga ggg gac ggc atc atg cac act cgc tgc cac tgt gga gct gag atc 15063
Arg Gly Asp Gly Ile Met His Thr Arg Cys His Cys Gly Ala Glu Ile
780 785 790 795

act gga cat gtc aaa aac ggg acg atg agg atc gtc ggt cct agg acc 15111
Thr Gly His Val Lys Asn Gly Thr Met Arg Ile Val Gly Pro Arg Thr
800 805 810

tgc agg aac atg tgg agt ggg acc ttc ccc att aat gcc tac acc acg 15159
Cys Arg Asn Met Trp Ser Gly Thr Phe Pro Ile Asn Ala Tyr Thr Thr
815 820 825

ggc ccc tgt acc ccc ctt cct gcg ccg aac tac acg ttc gcg cta tgg 15207
Gly Pro Cys Thr Pro Leu Pro Ala Pro Asn Tyr Thr Phe Ala Leu Trp
830 835 840

agg gtg tct gca gag gaa tac gtg gag ata agg cag gtg ggg gac ttc 15255
Arg Val Ser Ala Glu Glu Tyr Val Glu Ile Arg Gln Val Gly Asp Phe
845 850 855

cac tac gtg acg ggt atg act act gac aat ctt aaa tgc ccg tgc cag 15303
His Tyr Val Thr Gly Met Thr Thr Asp Asn Leu Lys Cys Pro Cys Gln
860 865 870 875

gtc cca tcg ccc gaa ttt ttc aca gaa ttg gac ggg gtg cgc cta cat 15351
Val Pro Ser Pro Glu Phe Phe Thr Glu Leu Asp Gly Val Arg Leu His
880 885 890

agg ttt gcg ccc ccc tgc aag ccc ttg ctg cgg gag gag gta tca ttc 15399
Arg Phe Ala Pro Pro Cys Lys Pro Leu Leu Arg Glu Glu Val Ser Phe
895 900 905

aga gta gga ctc cac gaa tac ccg gta ggg tcg caa tta cct tgc gag 15447
Arg Val Gly Leu His Glu Tyr Pro Val Gly Ser Gln Leu Pro Cys Glu
910 915 920

ccc gaa ccg gac gtg gcc gtg ttg acg tcc atg ctc act gat ccc tcc 15495
Pro Glu Pro Asp Val Ala Val Leu Thr Ser Met Leu Thr Asp Pro Ser
925 930 935

cat ata aca gca gag gcg gcc ggg cga agg ttg gcg agg gga tca ccc 15543
His Ile Thr Ala Glu Ala Ala Gly Arg Arg Leu Ala Arg Gly Ser Pro
940 945 950 955

ccc tct gtg gcc agc tcc tcg gct agc cag cta tcc gct cca tct ctc 15591
Pro Ser Val Ala Ser Ser Ser Ala Ser Gln Leu Ser Ala Pro Ser Leu
960 965 970

aag gca act tgc acc gct aac cat gac tcc cct gat gct gag ctc ata 15639
Lys Ala Thr Cys Thr Ala Asn His Asp Ser Pro Asp Ala Glu Leu Ile
975 980 985

gag gcc aac ctc cta tgg agg cag gag atg ggc ggc aac atc acc agg 15687
Glu Ala Asn Leu Leu Trp Arg Gln Glu Met Gly Gly Asn Ile Thr Arg
990 995 1000

gtt gag tca gaa aac aaa gtg gtg att ctg gac tcc ttc gat ccg ctt 15735
Val Glu Ser Glu Asn Lys Val Val Ile Leu Asp Ser Phe Asp Pro Leu
1005 1010 1015
gtg gcg gag gag gac gag cgg gag atc tcc gta ccc gca gaa atc ctg 15783
Val Ala Glu Glu Asp Glu Arg Glu Ile Ser Val Pro Ala Glu Ile Leu
1020 1025 1030 1035

cgg aag tct cgg aga ttc gcc cag gcc ctg ccc gtt tgg gcg cgg ccg 15831
Arg Lys Ser Arg Arg Phe Ala Gln Ala Leu Pro Val Trp Ala Arg Pro
1040 1045 1050

gac tat aac ccc ccg cta gtg gag acg tgg aaa aag ccc gac tac gaa 15879
Asp Tyr Asn Pro Pro Leu Val Glu Thr Trp Lys Lys Pro Asp Tyr Glu
1055 1060 1065

cca cct gtg gtc cat ggc tgc ccg ctt cca cct cca aag tcc cct cct 15927
Pro Pro Val Val His Gly Cys Pro Leu Pro Pro Pro Lys Ser Pro Pro
1070 1075 1080

gtg cct ccg cct cgg aag aag cgg acg gtg gtc ctc act gaa tca acc 15975
Val Pro Pro Pro Arg Lys Lys Arg Thr Val Val Leu Thr Glu Ser Thr
1085 1090 1095

cta tct act gcc ttg gcc gag ctc gcc acc aga agc ttt ggc agc tcc 16023
Leu Ser Thr Ala Leu Ala Glu Leu Ala Thr Arg Ser Phe Gly Ser Ser
1100 1105 1110 1115

tca act tcc ggc att acg ggc gac aat acg aca aca tcc tct gag ccc 16071
Ser Thr Ser Gly Ile Thr Gly Asp Asn Thr Thr Thr Ser Ser Glu Pro
1120 1125 1130

gcc cct tct ggc tgc ccc ccc gac tcc gac gct gag tcc tat tcc tcc 16119
Ala Pro Ser Gly Cys Pro Pro Asp Ser Asp Ala Glu Ser Tyr Ser Ser
1135 1140 1145

atg ccc ccc ctg gag ggg gag cct ggg gat ccg gat ctt agc gac ggg 16167
Met Pro Pro Leu Glu Gly Glu Pro Gly Asp Pro Asp Leu Ser Asp Gly
1150 1155 1160

tca tgg tca acg gtc agt agt gag gcc aac gcg gag gat gtc gtg tgc 16215
Ser Trp Ser Thr Val Ser Ser Glu Ala Asn Ala Glu Asp Val Val Cys
1165 1170 1175

tgc tca atg tct tac tct tgg aca ggc gca ctc gtc acc ccg tgc gcc 16263
Cys Ser Met Ser Tyr Ser Trp Thr Gly Ala Leu Val Thr Pro Cys Ala
1180 1185 1190 1195

gcg gaa gaa cag aaa ctg ccc atc aat gca cta agc aac tcg ttg cta 16311
Ala Glu Glu Gln Lys Leu Pro Ile Asn Ala Leu Ser Asn Ser Leu Leu
1200 1205 1210

cgt cac cac aat ttg gtg tat tcc acc acc tca cgc agt gct tgc caa 16359
Arg His His Asn Leu Val Tyr Ser Thr Thr Ser Arg Ser Ala Cys Gln
1215 1220 1225

agg cag aag aaa gtc aca ttt gac aga ctg caa gtt ctg gac agc cat 16407
Arg Gln Lys Lys Val Thr Phe Asp Arg Leu Gln Val Leu Asp Ser His
1230 1235 1240

tac cag gac gta ctc aag gag gtt aaa gca gcg gcg tca aaa gtg aag 16455
Tyr Gln Asp Val Leu Lys Glu Val Lys Ala Ala Ala Ser Lys Val Lys
1245 1250 1255
gct aac ttg cta tcc gta gag gaa gct tgc agc ctg acg ccc cca cac 16503
Ala Asn Leu Leu Ser Val Glu Glu Ala Cys Ser Leu Thr Pro Pro His
1260 1265 1270 1275

tca gcc aaa tcc aag ttt ggt tat ggg gca aaa gac gtc cgt tgc cat 16551
Ser Ala Lys Ser Lys Phe Gly Tyr Gly Ala Lys Asp Val Arg Cys His
1280 1285 1290

gcc aga aag gcc gta acc cac atc aac tcc gtg tgg aaa gac ctt ctg 16599
Ala Arg Lys Ala Val Thr His Ile Asn Ser Val Trp Lys Asp Leu Leu
1295 1300 1305

gaa gac aat gta aca cca ata gac act acc atc atg gct aag aac gag 16647
Glu Asp Asn Val Thr Pro Ile Asp Thr Thr Ile Met Ala Lys Asn Glu
1310 1315 1320

gtt ttc tgc gtt cag cct gag aag ggg ggt cgt aag cca gct cgt ctc 16695
Val Phe Cys Val Gln Pro Glu Lys Gly Gly Arg Lys Pro Ala Arg Leu
1325 1330 1335

atc gtg ttc ccc gat ctg ggc gtg cgc gtg tgc gaa aag atg gct ttg 16743
Ile Val Phe Pro Asp Leu Gly Val Arg Val Cys Glu Lys Met Ala Leu
1340 1345 1350 1355

tac gac gtg gtt aca aag ctc ccc ttg gcc gtg atg gga agc tcc tac 16791
Tyr Asp Val Val Thr Lys Leu Pro Leu Ala Val Met Gly Ser Ser Tyr
1360 1365 1370

gga ttc caa tac tca cca gga cag cgg gtt gaa ttc ctc gtg caa gcg 16839
Gly Phe Gln Tyr Ser Pro Gly Gln Arg Val Glu Phe Leu Val Gln Ala
1375 1380 1385

tgg aag tcc aag aaa acc cca atg ggg ttc tcg tat gat acc cgc tgc 16887
Trp Lys Ser Lys Lys Thr Pro Met Gly Phe Ser Tyr Asp Thr Arg Cys
1390 1395 1400

ttt gac tcc aca gtc act gag agc gac atc cgt acg gag gag gca atc 16935
Phe Asp Ser Thr Val Thr Glu Ser Asp Ile Arg Thr Glu Glu Ala Ile
1405 1410 1415

tac caa tgt tgt gac ctc gac ccc caa gcc cgc gtg gcc atc aag tcc 16983
Tyr Gln Cys Cys Asp Leu Asp Pro Gln Ala Arg Val Ala Ile Lys Ser
1420 1425 1430 1435

ctc acc gag agg ctt tat gtt ggg ggc cct ctt acc aat tca agg ggg 17031
Leu Thr Glu Arg Leu Tyr Val Gly Gly Pro Leu Thr Asn Ser Arg Gly
1440 1445 1450

gag aac tgc ggc tat cgc agg tgc cgc gcg agc ggc gta ctg aca act 17079
Glu Asn Cys Gly Tyr Arg Arg Cys Arg Ala Ser Gly Val Leu Thr Thr
1455 1460 1465

agc tgt ggt aac acc ctc act tgc tac atc aag gcc cgg gca gcc tgt 17127
Ser Cys Gly Asn Thr Leu Thr Cys Tyr Ile Lys Ala Arg Ala Ala Cys
1470 1475 1480

cga gcc gca ggg ctc cag gac tgc acc atg ctc gtg tgt ggc gac gac 17175
Arg Ala Ala Gly Leu Gln Asp Cys Thr Met Leu Val Cys Gly Asp Asp
1485 1490 1495
tta gtc gtt atc tgt gaa agc gcg ggg gtc cag gag gac gcg gcg agc 17223
Leu Val Val Ile Cys Glu Ser Ala Gly Val Gln Glu Asp Ala Ala Ser
1500 1505 1510 1515

ctg aga gcc ttc acg gag gct atg acc agg tac tcc gcc ccc cct ggg 17271
Leu Arg Ala Phe Thr Glu Ala Met Thr Arg Tyr Ser Ala Pro Pro Gly
1520 1525 1530

gac ccc cca caa cca gaa tac gac ttg gag ctc ata aca tca tgc tcc 17319
Asp Pro Pro Gln Pro Glu Tyr Asp Leu Glu Leu Ile Thr Ser Cys Ser
1535 1540 1545

tcc aac gtg tca gtc gcc cac gac ggc gct gga aag agg gtc tac tac 17367
Ser Asn Val Ser Val Ala His Asp Gly Ala Gly Lys Arg Val Tyr Tyr
1550 1555 1560

ctc acc cgt gac cct aca acc ccc ctc gcg aga gct gcg tgg gag aca 17415
Leu Thr Arg Asp Pro Thr Thr Pro Leu Ala Arg Ala Ala Trp Glu Thr
1565 1570 1575

gca aga cac act cca gtc aat tcc tgg cta ggc aac ata atc atg ttt 17463
Ala Arg His Thr Pro Val Asn Ser Trp Leu Gly Asn Ile Ile Met Phe
1580 1585 1590 1595

gcc ccc aca ctg tgg gcg agg atg ata ctg atg acc cat ttc ttt agc 17511
Ala Pro Thr Leu Trp Ala Arg Met Ile Leu Met Thr His Phe Phe Ser
1600 1605 1610

gtc ctt ata gcc agg gac cag ctt gaa cag gcc ctc gat tgc gag atc 17559
Val Leu Ile Ala Arg Asp Gln Leu Glu Gln Ala Leu Asp Cys Glu Ile
1615 1620 1625

tac ggg gcc tgc tac tcc ata gaa cca ctg gat cta cct cca atc att 17607
Tyr Gly Ala Cys Tyr Ser Ile Glu Pro Leu Asp Leu Pro Pro Ile Ile
1630 1635 1640

caa aga ctc cat ggc ctc agc gca ttt tca ctc cac agt tac tct cca 17655
Gln Arg Leu His Gly Leu Ser Ala Phe Ser Leu His Ser Tyr Ser Pro
1645 1650 1655

ggt gaa atc aat agg gtg gcc gca tgc ctc aga aaa ctt ggg gta ccg 17703
Gly Glu Ile Asn Arg Val Ala Ala Cys Leu Arg Lys Leu Gly Val Pro
1660 1665 1670 1675

ccc ttg cga gct tgg aga cac cgg gcc cgg agc gtc cgc gct agg ctt 17751
Pro Leu Arg Ala Trp Arg His Arg Ala Arg Ser Val Arg Ala Arg Leu
1680 1685 1690

ctg gcc aga gga ggc agg gct gcc ata tgt ggc aag tac ctc ttc aac 17799
Leu Ala Arg Gly Gly Arg Ala Ala Ile Cys Gly Lys Tyr Leu Phe Asn
1695 1700 1705

tgg gca gta aga aca aag ctc aaa ctc act cca ata gcg gcc gct ggc 17847
Trp Ala Val Arg Thr Lys Leu Lys Leu Thr Pro Ile Ala Ala Ala Gly
1710 1715 1720

cag ctg gac ttg tcc ggc tgg ttc acg gct ggc tac agc ggg gga gac 17895
Gln Leu Asp Leu Ser Gly Trp Phe Thr Ala Gly Tyr Ser Gly Gly Asp
1725 1730 1735
att tat cac agc gtg tct cat gcc cgg ccc cgc tgg atc tgg ttt tgc 17943
Ile Tyr His Ser Val Ser His Ala Arg Pro Arg Trp Ile Trp Phe Cys
1740 1745 1750 1755

cta ctc ctg ctt gct gca ggg gta ggc atc tac ctc ctc ccc aac cga 17991
Leu Leu Leu Leu Ala Ala Gly Val Gly Ile Tyr Leu Leu Pro Asn Arg
1760 1765 1770

atg agc acg aat cct aaa cct caa aga aag acc aaa cgt aac acc aac 18039
Met Ser Thr Asn Pro Lys Pro Gln Arg Lys Thr Lys Arg Asn Thr Asn
1775 1780 1785

cgg cgg ccg cag gac gtc aag ttc ccg ggt ggc ggt cag atc gtt ggt 18087
Arg Arg Pro Gln Asp Val Lys Phe Pro Gly Gly Gly Gln Ile Val Gly
1790 1795 1800

gga gtt tac ttg ttg ccg cgc agg ggc cct aga ttg ggt gtg cgc gcg 18135
Gly Val Tyr Leu Leu Pro Arg Arg Gly Pro Arg Leu Gly Val Arg Ala
1805 1810 1815

acg aga aag act tcc gag cgg tcg caa cct cga ggt aga cgt cag cct 18183
Thr Arg Lys Thr Ser Glu Arg Ser Gln Pro Arg Gly Arg Arg Gln Pro
1820 1825 1830 1835

atc ccc aag gct cgt cgg ccc gag ggc agg acc tgg gct cag ccc ggg 18231
Ile Pro Lys Ala Arg Arg Pro Glu Gly Arg Thr Trp Ala Gln Pro Gly
1840 1845 1850

tac cct tgg ccc ctc tat ggc aat gag ggc tgc ggg tgg gcg gga tgg 18279
Tyr Pro Trp Pro Leu Tyr Gly Asn Glu Gly Cys Gly Trp Ala Gly Trp
1855 1860 1865

ctc ctg tct ccc cgt ggc tct cgg cct agc tgg ggc ccc aca gac ccc 18327
Leu Leu Ser Pro Arg Gly Ser Arg Pro Ser Trp Gly Pro Thr Asp Pro
1870 1875 1880

cgg cgt agg tcg cgc aat ttg ggt aag gtc atc gat acc ctt acg tgc 18375
Arg Arg Arg Ser Arg Asn Leu Gly Lys Val Ile Asp Thr Leu Thr Cys
1885 1890 1895

ggc ttc gcc gac ctc atg ggg tac ata ccg ctc gtc taatagtcga 18421
Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Leu Val
1900 1905 1910

ctttgttccc actgtacttt tagctcgtac aaaatacaat atacttttca tttctccgta 18481

aacaacatgt tttcccatgt aatatccttt tctatttttc gttccgttac caactttaca 18541

catactttat atagctattc acttctatac actaaaaaac taagacaatt ttaattttgc 18601

tgcctgccat atttcaattt gttataaatt cctataattt atcctattag tagctaaaaa 18661

aagatgaatg tgaatcgaat cctaagagaa ttggatctga tccacaggac gggtgtggtc 18721

gccatgatcg cgtagtcgat agtggctcca agtagcgaag cgagcaggac tgggcggcgg 18781

ccaaagcggt cggacagtgc tccgagaacg ggtgcgcata gaaattgcat caacgcatat 18841

agcgctagca gcacgccata gtgactggcg atgctgtcgg aatggacgat atcccgcaag 18901
aggcccggca gtaccggcat aaccaagcct atgcctacag catccagggt gacggtgccg 18961
aggatgacga tgagcgcatt gttagatttc atacacggtg cctgactgcg ttagcaattt 19021
aactgtgata aactaccgca ttaaagcttt ttctttccaa tttttttttt ttcgtcatta 19081
taaaaatcat tacgaccgag attcccgggt aataactgat ataattaaat tgaagctcta 19141
atttgtgagt ttagtataca tgcatttact tataatacag ttttttagtt ttgctggccg 19201
caccttctca aatatgcttc ccagcctgct tttctgtaac gttcaccctc taccttagca 19261
tcccttccct ttgcaaatag tcctcttcca acaataataa tgtcagatcc tgtagagacc 19321
acatcatcca cggttctata ctgttgaccc aatgcgtctc ccttgtcatc taaacccaca 19381
ccgggtgtca taatcaacca atcgtaacct tcatctcttc cacccatgtc tctttgagca 19441
ataaagccga taacaaaatc tttgtcgctc ttcgcaatgt caacagtacc cttagtatat 19501
tctccagtag atagggagcc cttgcatgac aattctgcta acatcaaaag gcctctaggt 19561
tcctttgtta cttcttctgc cgcctgcttc aaaccgctaa caatacctgg gcccaccaca 19621
ccgtgtgcat tcgtaatgtc tgcccattct gctattctgt atacacccgc agagtactgc 19681
aatttgactg tattaccaat gtcagcaaat tttctgtctt cgaagagtaa aaaattgtac 19741
ttggcggata atgcctttag cggcttaact gtgccctcca tggaaaaatc agtcaagata 19801
tccacatgtg tttttagtaa acaaattttg ggacctaatg cttcaactaa ctccagtaat 19861
tccttggtgg tacgaacatc caatgaagca cacaagtttg tttgcttttc gtgcatgata 19921
ttaaatagct tggcagcaac aggactagga tgagtagcag cacgttcctt atatgtagct 19981
ttcgacatga tttatcttcg tttcctgcag gtttttgttc tgtgcagttg ggttaagaat 20041
actgggcaat ttcatgtttc ttcaacacta catatgcgta tatataccaa tctaagtctg 20101
tgctccttcc ttcgttcttc cttctgttcg gagattaccg aatcaaaaaa atttcaagga 20161
aaccgaaatc aaaaaaaaga ataaaaaaaa aatgatgaat tgaaaagctt atcgat 20217



<210> 17 
<211> 1911 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: 
pd.delta.NS3NS5.pj.core140 

<400> 17 

Met Ala Ala Tyr Ala Ala Gln Gly Tyr Lys Val Leu Val Leu Asn Pro 
1 5 10 15 

Ser Val Ala Ala Thr Leu Gly Phe Gly Ala Tyr Met Ser Lys Ala His 
20 25 30 

Gly Ile Asp Pro Asn Ile Arg Thr Gly Val Arg Thr Ile Thr Thr Gly 
35 40 45 

Ser Pro Ile Thr Tyr Ser Thr Tyr Gly Lys Phe Leu Ala Asp Gly Gly 
50 55 60 

Cys Ser Gly Gly Ala Tyr Asp Ile Ile Ile Cys Asp Glu Cys His Ser 
65 70 75 80 

Thr Asp Ala Thr Ser Ile Leu Gly Ile Gly Thr Val Leu Asp Gln Ala 
85 90 95 

Glu Thr Ala Gly Ala Arg Leu Val Val Leu Ala Thr Ala Thr Pro Pro 
100 105 110 

Gly Ser Val Thr Val Pro His Pro Asn Ile Glu Glu Val Ala Leu Ser 
115 120 125 

Thr Thr Gly Glu Ile Pro Phe Tyr Gly Lys Ala Ile Pro Leu Glu Val 
130 135 140 

Ile Lys Gly Gly Arg His Leu Ile Phe Cys His Ser Lys Lys Lys Cys 
145 150 155 160 

Asp Glu Leu Ala Ala Lys Leu Val Ala Leu Gly Ile Asn Ala Val Ala 
165 170 175 

Tyr Tyr Arg Gly Leu Asp Val Ser Val Ile Pro Thr Ser Gly Asp Val 
180 185 190 

Val Val Val Ala Thr Asp Ala Leu Met Thr Gly Tyr Thr Gly Asp Phe 
195 200 205 

Asp Ser Val Ile Asp Cys Asn Thr Cys Val Thr Gln Thr Val Asp Phe 
210 215 220 

Ser Leu Asp Pro Thr Phe Thr Ile Glu Thr Ile Thr Leu Pro Gln Asp 
225 230 235 240 

Ala Val Ser Arg Thr Gln Arg Arg Gly Arg Thr Gly Arg Gly Lys Pro 
245 250 255 

Gly Ile Tyr Arg Phe Val Ala Pro Gly Glu Arg Pro Ser Gly Met Phe 
260 265 270 

Asp Ser Ser Val Leu Cys Glu Cys Tyr Asp Ala Gly Cys Ala Trp Tyr 
275 280 285 

Glu Leu Thr Pro Ala Glu Thr Thr Val Arg Leu Arg Ala Tyr Met Asn 
290 295 300 

Thr Pro Gly Leu Pro Val Cys Gln Asp His Leu Glu Phe Trp Glu Gly 
305 310 315 320 



Val Phe Thr Gly Leu Thr His Ile Asp Ala His Phe Leu Ser Gln Thr 
325 330 335 

Lys Gln Ser Gly Glu Asn Leu Pro Tyr Leu Val Ala Tyr Gln Ala Thr 
340 345 350 

Val Cys Ala Arg Ala Gln Ala Pro Pro Pro Ser Trp Asp Gln Met Trp 
355 360 365 

Lys Cys Leu Ile Arg Leu Lys Pro Thr Leu His Gly Pro Thr Pro Leu 
370 375 380 

Leu Tyr Arg Leu Gly Ala Val Gln Asn Glu Ile Thr Leu Thr His Pro 
385 390 395 400 

Val Thr Lys Tyr Ile Met Thr Cys Met Ser Ala Asp Leu Glu Val Val 
405 410 415 

Thr Ser Thr Trp Val Leu Val Gly Gly Val Leu Ala Ala Leu Ala Ala 
420 425 430 

Tyr Cys Leu Ser Thr Gly Cys Val Val Ile Val Gly Arg Val Val Leu 
435 440 445 

Ser Gly Lys Pro Ala Ile Ile Pro Asp Arg Glu Val Leu Tyr Arg Glu 
450 455 460 

Phe Asp Glu Met Glu Glu Cys Ser Gln His Leu Pro Tyr Ile Glu Gln 
465 470 475 480 

Gly Met Met Leu Ala Glu Gln Phe Lys Gln Lys Ala Leu Gly Leu Leu 
485 490 495 

Gln Thr Ala Ser Arg Gln Ala Glu Val Ile Ala Pro Ala Val Gln Thr 
500 505 510 

Asn Trp Gln Lys Leu Glu Thr Phe Trp Ala Lys His Met Trp Asn Phe 
515 520 525 

Ile Ser Gly Ile Gln Tyr Leu Ala Gly Leu Ser Thr Leu Pro Gly Asn 
530 535 540 

Pro Ala Ile Ala Ser Leu Met Ala Phe Thr Ala Ala Val Thr Ser Pro 
545 550 555 560 

Leu Thr Thr Ser Gln Thr Leu Leu Phe Asn Ile Leu Gly Gly Trp Val 
565 570 575 

Ala Ala Gln Leu Ala Ala Pro Gly Ala Ala Thr Ala Phe Val Gly Ala 
580 585 590 

Gly Leu Ala Gly Ala Ala Ile Gly Ser Val Gly Leu Gly Lys Val Leu 
595 600 605 

Ile Asp Ile Leu Ala Gly Tyr Gly Ala Gly Val Ala Gly Ala Leu Val 
610 615 620 

Ala Phe Lys Ile Met Ser Gly Glu Val Pro Ser Thr Glu Asp Leu Val 
625 630 635 640 

Asn Leu Leu Pro Ala Ile Leu Ser Pro Gly Ala Leu Val Val Gly Val 
645 650 655 

Val Cys Ala Ala Ile Leu Arg Arg His Val Gly Pro Gly Glu Gly Ala 
660 665 670 

Val Gln Trp Met Asn Arg Leu Ile Ala Phe Ala Ser Arg Gly Asn His 
675 680 685 

Val Ser Pro Thr His Tyr Val Pro Glu Ser Asp Ala Ala Ala Arg Val 
690 695 700 

Thr Ala Ile Leu Ser Ser Leu Thr Val Thr Gln Leu Leu Arg Arg Leu 
705 710 715 720 

His Gln Trp Ile Ser Ser Glu Cys Thr Thr Pro Cys Ser Gly Ser Trp 
725 730 735 

Leu Arg Asp Ile Trp Asp Trp Ile Cys Glu Val Leu Ser Asp Phe Lys 
740 745 750 

Thr Trp Leu Lys Ala Lys Leu Met Pro Gln Leu Pro Gly Ile Pro Phe 
755 760 765 

Val Ser Cys Gln Arg Gly Tyr Lys Gly Val Trp Arg Gly Asp Gly Ile 
770 775 780 

Met His Thr Arg Cys His Cys Gly Ala Glu Ile Thr Gly His Val Lys 
785 790 795 800 

Asn Gly Thr Met Arg Ile Val Gly Pro Arg Thr Cys Arg Asn Met Trp 
805 810 815 

Ser Gly Thr Phe Pro Ile Asn Ala Tyr Thr Thr Gly Pro Cys Thr Pro 
820 825 830 

Leu Pro Ala Pro Asn Tyr Thr Phe Ala Leu Trp Arg Val Ser Ala Glu 
835 840 845 

Glu Tyr Val Glu Ile Arg Gln Val Gly Asp Phe His Tyr Val Thr Gly 
850 855 860 

Met Thr Thr Asp Asn Leu Lys Cys Pro Cys Gln Val Pro Ser Pro Glu 
865 870 875 880 

Phe Phe Thr Glu Leu Asp Gly Val Arg Leu His Arg Phe Ala Pro Pro 
885 890 895 

Cys Lys Pro Leu Leu Arg Glu Glu Val Ser Phe Arg Val Gly Leu His 
900 905 910 

Glu Tyr Pro Val Gly Ser Gln Leu Pro Cys Glu Pro Glu Pro Asp Val 
915 920 925 

Ala Val Leu Thr Ser Met Leu Thr Asp Pro Ser His Ile Thr Ala Glu 
930 935 940 

Ala Ala Gly Arg Arg Leu Ala Arg Gly Ser Pro Pro Ser Val Ala Ser 
945 950 955 960 

Ser Ser Ala Ser Gln Leu Ser Ala Pro Ser Leu Lys Ala Thr Cys Thr 
965 970 975 

Ala Asn His Asp Ser Pro Asp Ala Glu Leu Ile Glu Ala Asn Leu Leu 
980 985 990 

Trp Arg Gln Glu Met Gly Gly Asn Ile Thr Arg Val Glu Ser Glu Asn 
995 1000 1005 

Lys Val Val Ile Leu Asp Ser Phe Asp Pro Leu Val Ala Glu Glu Asp 
1010 1015 1020 

Glu Arg Glu Ile Ser Val Pro Ala Glu Ile Leu Arg Lys Ser Arg Arg 
1025 1030 1035 1040 

Phe Ala Gln Ala Leu Pro Val Trp Ala Arg Pro Asp Tyr Asn Pro Pro 
1045 1050 1055 

Leu Val Glu Thr Trp Lys Lys Pro Asp Tyr Glu Pro Pro Val Val His 
1060 1065 1070 

Gly Cys Pro Leu Pro Pro Pro Lys Ser Pro Pro Val Pro Pro Pro Arg 
1075 1080 1085 

Lys Lys Arg Thr Val Val Leu Thr Glu Ser Thr Leu Ser Thr Ala Leu 
1090 1095 1100 

Ala Glu Leu Ala Thr Arg Ser Phe Gly Ser Ser Ser Thr Ser Gly Ile 
1105 1110 1115 1120 

Thr Gly Asp Asn Thr Thr Thr Ser Ser Glu Pro Ala Pro Ser Gly Cys 
1125 1130 1135 

Pro Pro Asp Ser Asp Ala Glu Ser Tyr Ser Ser Met Pro Pro Leu Glu 
1140 1145 1150 

Gly Glu Pro Gly Asp Pro Asp Leu Ser Asp Gly Ser Trp Ser Thr Val 
1155 1160 1165 

Ser Ser Glu Ala Asn Ala Glu Asp Val Val Cys Cys Ser Met Ser Tyr 
1170 1175 1180 

Ser Trp Thr Gly Ala Leu Val Thr Pro Cys Ala Ala Glu Glu Gln Lys 
1185 1190 1195 1200 

Leu Pro Ile Asn Ala Leu Ser Asn Ser Leu Leu Arg His His Asn Leu 
1205 1210 1215 

Val Tyr Ser Thr Thr Ser Arg Ser Ala Cys Gln Arg Gln Lys Lys Val 
1220 1225 1230 

Thr Phe Asp Arg Leu Gln Val Leu Asp Ser His Tyr Gln Asp Val Leu 
1235 1240 1245 

Lys Glu Val Lys Ala Ala Ala Ser Lys Val Lys Ala Asn Leu Leu Ser 
1250 1255 1260 

Val Glu Glu Ala Cys Ser Leu Thr Pro Pro His Ser Ala Lys Ser Lys 
1265 1270 1275 1280 

Phe Gly Tyr Gly Ala Lys Asp Val Arg Cys His Ala Arg Lys Ala Val 
1285 1290 1295 






Thr His Ile Asn Ser Val Trp Lys Asp Leu Leu Glu Asp Asn Val Thr 
1300 1305 1310 

Pro Ile Asp Thr Thr Ile Met Ala Lys Asn Glu Val Phe Cys Val Gln 
1315 1320 1325 

Pro Glu Lys Gly Gly Arg Lys Pro Ala Arg Leu Ile Val Phe Pro Asp 
1330 1335 1340 

Leu Gly Val Arg Val Cys Glu Lys Met Ala Leu Tyr Asp Val Val Thr 
1345 1350 1355 1360 

Lys Leu Pro Leu Ala Val Met Gly Ser Ser Tyr Gly Phe Gln Tyr Ser 
1365 1370 1375 

Pro Gly Gln Arg Val Glu Phe Leu Val Gln Ala Trp Lys Ser Lys Lys 
1380 1385 1390 

Thr Pro Met Gly Phe Ser Tyr Asp Thr Arg Cys Phe Asp Ser Thr Val 
1395 1400 1405 

Thr Glu Ser Asp Ile Arg Thr Glu Glu Ala Ile Tyr Gln Cys Cys Asp 
1410 1415 1420 

Leu Asp Pro Gln Ala Arg Val Ala Ile Lys Ser Leu Thr Glu Arg Leu 
1425 1430 1435 1440 

Tyr Val Gly Gly Pro Leu Thr Asn Ser Arg Gly Glu Asn Cys Gly Tyr 
1445 1450 1455 

Arg Arg Cys Arg Ala Ser Gly Val Leu Thr Thr Ser Cys Gly Asn Thr 
1460 1465 1470 

Leu Thr Cys Tyr Ile Lys Ala Arg Ala Ala Cys Arg Ala Ala Gly Leu 
1475 1480 1485 

Gln Asp Cys Thr Met Leu Val Cys Gly Asp Asp Leu Val Val Ile Cys 
1490 1495 1500 

Glu Ser Ala Gly Val Gln Glu Asp Ala Ala Ser Leu Arg Ala Phe Thr 
1505 1510 1515 1520 

Glu Ala Met Thr Arg Tyr Ser Ala Pro Pro Gly Asp Pro Pro Gln Pro 
1525 1530 1535 

Glu Tyr Asp Leu Glu Leu Ile Thr Ser Cys Ser Ser Asn Val Ser Val 
1540 1545 1550 

Ala His Asp Gly Ala Gly Lys Arg Val Tyr Tyr Leu Thr Arg Asp Pro 
1555 1560 1565 

Thr Thr Pro Leu Ala Arg Ala Ala Trp Glu Thr Ala Arg His Thr Pro 
1570 1575 1580 

Val Asn Ser Trp Leu Gly Asn Ile Ile Met Phe Ala Pro Thr Leu Trp 
1585 1590 1595 1600 

Ala Arg Met Ile Leu Met Thr His Phe Phe Ser Val Leu Ile Ala Arg 
1605 1610 1615 

Asp Gln Leu Glu Gln Ala Leu Asp Cys Glu Ile Tyr Gly Ala Cys Tyr 
1620 1625 1630 

Ser Ile Glu Pro Leu Asp Leu Pro Pro Ile Ile Gln Arg Leu His Gly 
1635 1640 1645 

Leu Ser Ala Phe Ser Leu His Ser Tyr Ser Pro Gly Glu Ile Asn Arg 
1650 1655 1660 

Val Ala Ala Cys Leu Arg Lys Leu Gly Val Pro Pro Leu Arg Ala Trp 
1665 1670 1675 1680 

Arg His Arg Ala Arg Ser Val Arg Ala Arg Leu Leu Ala Arg Gly Gly 
1685 1690 1695 

Arg Ala Ala Ile Cys Gly Lys Tyr Leu Phe Asn Trp Ala Val Arg Thr 
1700 1705 1710 

Lys Leu Lys Leu Thr Pro Ile Ala Ala Ala Gly Gln Leu Asp Leu Ser 
1715 1720 1725 

Gly Trp Phe Thr Ala Gly Tyr Ser Gly Gly Asp Ile Tyr His Ser Val 
1730 1735 1740 

Ser His Ala Arg Pro Arg Trp Ile Trp Phe Cys Leu Leu Leu Leu Ala 
1745 1750 1755 1760 

Ala Gly Val Gly Ile Tyr Leu Leu Pro Asn Arg Met Ser Thr Asn Pro 
1765 1770 1775 

Lys Pro Gln Arg Lys Thr Lys Arg Asn Thr Asn Arg Arg Pro Gln Asp 
1780 1785 1790 

Val Lys Phe Pro Gly Gly Gly Gln Ile Val Gly Gly Val Tyr Leu Leu 
1795 1800 1805 

Pro Arg Arg Gly Pro Arg Leu Gly Val Arg Ala Thr Arg Lys Thr Ser 
1810 1815 1820 

Glu Arg Ser Gln Pro Arg Gly Arg Arg Gln Pro Ile Pro Lys Ala Arg 
1825 1830 1835 1840 

Arg Pro Glu Gly Arg Thr Trp Ala Gln Pro Gly Tyr Pro Trp Pro Leu 
1845 1850 1855 

Tyr Gly Asn Glu Gly Cys Gly Trp Ala Gly Trp Leu Leu Ser Pro Arg 
1860 1865 1870 

Gly Ser Arg Pro Ser Trp Gly Pro Thr Asp Pro Arg Arg Arg Ser Arg 
1875 1880 1885 

Asn Leu Gly Lys Val Ile Asp Thr Leu Thr Cys Gly Phe Ala Asp Leu 
1890 1895 1900 

Met Gly Tyr Ile Pro Leu Val 
1905 1910 

<210> 18 
<211> 20247 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: 
pd.delta.NS3NS5.pj.core150 

<220> 
<221> CDS 

<222> (12679)..(18441) 
<400> 18 



atcgatccta ccccttgcgc taaagaagta tatgtgccta ctaacgcttg tctttgtctc 60 

tgtcactaaa cactggatta ttactcccag atacttattt tggactaatt taaatgattt 120 

cggatcaacg ttcttaatat cgctgaatct tccacaattg atgaaagtag ctaggaagag 180 

gaattggtat aaagtttttg tttttgtaaa tctcgaagta tactcaaacg aatttagtat 240 

tttctcagtg atctcccaga tgctttcacc ctcacttaga agtgctttaa gcattttttt 300 

actgtggcta tttcccttat ctgcttcttc cgatgattcg aactgtaatt gcaaactact 360 

tacaatatca gtgatatcag attgatgttt ttgtccatag taaggaataa ttgtaaattc 420 

ccaagcagga atcaatttct ttaatgaggc ttccagaatt gttgcttttt gcgtcttgta 480 

tttaaactgg agtgatttat tgacaatatc gaaactcagc gaattgctta tgatagtatt 540 

atagctcatg aatgtggctc tcttgattgc tgttccgtta tgtgtaatca tccaacataa 600 

ataggttagt tcagcagcac ataatgctat tttctcacct gaaggtcttt caaacctttc 660 

cacaaactga cgaacaagca ccttaggtgg tgttttacat aatatatcaa attgtggcat 720 

gcttagcgcc gatcttgtgt gcaattgata tctagtttca actactctat ttatcttgta 780 

tcttgcagta ttcaaacacg ctaactcgaa aaactaactt taattgtcct gtttgtctcg 840 

cgttctttcg aaaaatgcac cggccgcgca ttatttgtac tgcgaaaata attggtactg 900 

cggtatcttc atttcatatt ttaaaaatgc acctttgctg cttttcctta atttttagac 960 

ggcccgcagg ttcgttttgc ggtactatct tgtgataaaa agttgttttg acatgtgatc 1020 

tgcacagatt ttataatgta ataagcaaga atacattatc aaacgaacaa tactggtaaa 1080 

agaaaaccaa aatggacgac attgaaacag ccaagaatct gacggtaaaa gcacgtacag 1140 

cttatagcgt ctgggatgta tgtcggctgt ttattgaaat gattgctcct gatgtagata 1200 

ttgatataga gagtaaacgt aagtctgatg agctactctt tccaggatat gtcataaggc 1260 

ccatggaatc tctcacaacc ggtaggccgt atggtcttga ttctagcgca gaagattcca 1320 

gcgtatcttc tgactccagt gctgaggtaa ttttgcctgc tgcgaagatg gttaaggaaa 1380 

ggtttgattc gattggaaat ggtatgctct cttcacaaga agcaagtcag gctgccatag 1440 

atttgatgct acagaataac aagctgttag acaatagaaa gcaactatac aaatctattg 1500 

ctataataat aggaagattg cccgagaaag acaagaagag agctaccgaa atgctcatga 1560 

gaaaaatgga ttgtacacag ttattagtcc caccagctcc aacggaagaa gatgttatga 1620 

agctcgtaag cgtcgttacc caattgctta ctttagttcc accagatcgt caagctgctt 1680 

taataggtga tttattcatc ccggaatctc taaaggatat attcaatagt ttcaatgaac 1740 

tggcggcaga gaatcgttta cagcaaaaaa agagtgagtt ggaaggaagg actgaagtga 1800 

accatgctaa tacaaatgaa gaagttccct ccaggcgaac aagaagtaga gacacaaatg 1860 

caagaggagc atataaatta caaaacacca tcactgaggg ccctaaagcg gttcccacga 1920 

aaaaaaggag agtagcaacg agggtaaggg gcagaaaatc acgtaatact tctagggtat 1980 

gatccaatat caaaggaaat gatagcattg aaggatgaga ctaatccaat tgaggagtgg 2040 

cagcatatag aacagctaaa gggtagtgct gaaggaagca tacgataccc cgcatggaat 2100 

gggataatat cacaggaggt actagactac ctttcatcct acataaatag acgcatataa 2160 

gtacgcattt aagcataaac acgcactatg ccgttcttct catgtatata tatatacagg 2220 

caacacgcag atataggtgc gacgtgaaca gtgagctgta tgtgcgcagc tcgcgttgca 2280 

ttttcggaag cgctcgtttt cggaaacgct ttgaagttcc tattccgaag ttcctattct 2340 

ctagaaagta taggaacttc agagcgcttt tgaaaaccaa aagcgctctg aagacgcact 2400 

ttcaaaaaac caaaaacgca ccggactgta acgagctact aaaatattgc gaataccgct 2460 

tccacaaaca ttgctcaaaa gtatctcttt gctatatatc tctgtgctat atccctatat 2520 

aacctaccca tccacctttc gctccttgaa cttgcatcta aactcgacct ctacatcaac 2580 

aggcttccaa tgctcttcaa attttactgt caagtagacc catacggctg taatatgctg 2640 

ctcttcataa tgtaagctta tctttatcga atcgtgtgaa aaactactac cgcgataaac 2700 

ctttacggtt ccctgagatt gaattagttc ctttagtata tgatacaaga cacttttgaa 2760 

ctttgtacga cgaattttga ggttcgccat cctctggcta tttccaatta tcctgtcggc 2820 

tattatctcc gcctcagttt gatcttccgc ttcagactgc catttttcac ataatgaatc 2880 

tatttcaccc cacaatcctt catccgcctc cgcatcttgt tccgttaaac tattgacttc 2940 

atgttgtaca ttgtttagtt cacgagaagg gtcctcttca ggcggtagct cctgatctcc 3000 

tatatgacct ttatcctgtt ctctttccac aaacttagaa atgtattcat gaattatgga 3060 

gcacctaata acattcttca aggcggagaa gtttgggcca gatgcccaat atgcttgaca 3120 

tgaaaacgtg agaatgaatt tagtattatt gtgatattct gaggcaattt tattataatc 3180 

tcgaagataa gagaagaatg cagtgacctt tgtattgaca aatggagatt ccatgtatct 3240 

aaaaaatacg cctttaggcc ttctgatacc ctttcccctg cggtttagcg tgccttttac 3300 

attaatatct aaaccctctc cgatggtggc ctttaactga ctaataaatg caaccgatat 3360 

aaactgtgat aattctgggt gatttatgat tcgatcgaca attgtattgt acactagtgc 3420 

aggatcaggc caatccagtt ctttttcaat taccggtgtg tcgtctgtat tcagtacatg 3480 

tccaacaaat gcaaatgcta acgttttgta tttcttataa ttgtcaggaa ctggaaaagt 3540 

cccccttgtc gtctcgatta cacacctact ttcatcgtac accataggtt ggaagtgctg 3600 

cataatacat tgcttaatac aagcaagcag tctctcgcca ttcatatttc agttattttc 3660 

cattacagct gatgtcattg tatatcagcg ctgtaaaaat ctatctgtta cagaaggttt 3720 

tcgcggtttt tataaacaaa actttcgtta cgaaatcgag caatcacccc agctgcgtat 3780 

ttggaaattc gggaaaaagt agagcaacgc gagttgcatt ttttacacca taatgcatga 3840 

ttaacttcga gaagggatta aggctaattt cactagtatg tttcaaaaac ctcaatctgt 3900 

ccattgaatg ccttataaaa cagctataga ttgcatagaa gagttagcta ctcaatgctt 3960 

tttgtcaaag cttactgatg atgatgtgtc tactttcagg cgggtctgta gtaaggagaa 4020 

tgacattata aagctggcac ttagaattcc acggactata gactatacta gtatactccg 4080 

tctactgtac gatacacttc cgctcaggtc cttgtccttt aacgaggcct taccactctt 4140 

ttgttactct attgatccag ctcagcaaag gcagtgtgat ctaagattct atcttcgcga 4200 

tgtagtaaaa ctagctagac cgagaaagag actagaaatg caaaaggcac ttctacaatg 4260 

gctgccatca ttattatccg atgtgacgct gcattttttt tttttttttt tttttttttt 4320 

tttttttttt tttttttttt ttttttggta caaatatcat aaaaaaagag aatcttttta 4380 

agcaaggatt ttcttaactt cttcggcgac agcatcaccg acttcggtgg tactgttgga 4440 

accacctaaa tcaccagttc tgatacctgc atccaaaacc tttttaactg catcttcaat 4500 

ggctttacct tcttcaggca agttcaatga caatttcaac atcattgcag cagacaagat 4560 

agtggcgata gggttgacct tattctttgg caaatctgga gcggaaccat ggcatggttc 4620 

gtacaaacca aatgcggtgt tcttgtctgg caaagaggcc aaggacgcag atggcaacaa 4680 

acccaaggag cctgggataa cggaggcttc atcggagatg atatcaccaa acatgttgct 4740 

ggtgattata ataccattta ggtgggttgg gttcttaact aggatcatgg cggcagaatc 4800 

aatcaattga tgttgaactt tcaatgtagg gaattcgttc ttgatggttt cctccacagt 4860 

ttttctccat aatcttgaag aggccaaaac attagcttta tccaaggacc aaataggcaa 4920 

tggtggctca tgttgtaggg ccatgaaagc ggccattctt gtgattcttt gcacttctgg 4980 

aacggtgtat tgttcactat cccaagcgac accatcacca tcgtcttcct ttctcttacc 5040 

aaagtaaata cctcccacta attctctaac aacaacgaag tcagtacctt tagcaaattg 5100 

tggcttgatt ggagataagt ctaaaagaga gtcggatgca aagttacatg gtcttaagtt 5160 

ggcgtacaat tgaagttctt tacggatttt tagtaaacct tgttcaggtc taacactacc 5220 

ggtaccccat ttaggaccac ccacagcacc taacaaaacg gcatcagcct tcttggaggc 5280 

ttccagcgcc tcatctggaa gtggaacacc tgtagcatcg atagcagcac caccaattaa 5340 

atgattttcg aaatcgaact tgacattgga acgaacatca gaaatagctt taagaacctt 5400 

aatggcttcg gctgtgattt cttgaccaac gtggtcacct ggcaaaacga cgatcttctt 5460 

aggggcagac attacaatgg tatatccttg aaatatatat aaaaaaaaaa aaaaaaaaaa 5520 

aaaaaaaaaa atgcagcttc tcaatgatat tcgaatacgc tttgaggaga tacagcctaa 5580 

tatccgacaa actgttttac agatttacga tcgtacttgt tacccatcat tgaattttga 5640 

acatccgaac ctgggagttt tccctgaaac agatagtata tttgaacctg tataataata 5700 

tatagtctag cgctttacgg aagacaatgt atgtatttcg gttcctggag aaactattgc 5760 

atctattgca taggtaatct tgcacgtcgc atccccggtt cattttctgc gtttccatct 5820 

tgcacttcaa tagcatatct ttgttaacga agcatctgtg cttcattttg tagaacaaaa 5880 

atgcaacgcg agagcgctaa tttttcaaac aaagaatctg agctgcattt ttacagaaca 5940 

gaaatgcaac gcgaaagcgc tattttacca acgaagaatc tgtgcttcat ttttgtaaaa 6000 

caaaaatgca acgcgagagc gctaattttt caaacaaaga atctgagctg catttttaca 6060 

gaacagaaat gcaacgcgag agcgctattt taccaacaaa gaatctatac ttcttttttg 6120 

ttctacaaaa atgcatcccg agagcgctat ttttctaaca aagcatctta gattactttt 6180 

tttctccttt gtgcgctcta taatgcagtc tcttgataac tttttgcact gtaggtccgt 6240 

taaggttaga agaaggctac tttggtgtct attttctctt ccataaaaaa agcctgactc 6300 

cacttcccgc gtttactgat tactagcgaa gctgcgggtg cattttttca agataaaggc 6360 

atccccgatt atattctata ccgatgtgga ttgcgcatac tttgtgaaca gaaagtgata 6420 

gcgttgatga ttcttcattg gtcagaaaat tatgaacggt ttcttctatt ttgtctctat 6480 

atactacgta taggaaatgt ttacattttc gtattgtttt cgattcactc tatgaatagt 6540 

tcttactaca atttttttgt ctaaagagta atactagaga taaacataaa aaatgtagag 6600 

gtcgagttta gatgcaagtt caaggagcga aaggtggatg ggtaggttat atagggatat 6660 

agcacagaga tatatagcaa agagatactt ttgagcaatg tttgtggaag cggtattcgc 6720 

aatattttag tagctcgtta cagtccggtg cgtttttggt tttttgaaag tgcgtcttca 6780 

gagcgctttt ggttttcaaa agcgctctga agttcctata ctttctagag aataggaact 6840 

tcggaatagg aacttcaaag cgtttccgaa aacgagcgct tccgaaaatg caacgcgagc 6900 

tgcgcacata cagctcactg ttcacgtcgc acctatatct gcgtgttgcc tgtatatata 6960 

tatacatgag aagaacggca tagtgcgtgt ttatgcttaa atgcgtactt atatgcgtct 7020 

atttatgtag gatgaaaggt agtctagtac ctcctgtgat attatcccat tccatgcggg 7080 

gtatcgtatg cttccttcag cactaccctt tagctgttct atatgctgcc actcctcaat 7140 

tggattagtc tcatccttca atgctatcat ttcctttgat attggatcat atgcatagta 7200 

ccgagaaact agtgcgaagt agtgatcagg tattgctgtt atctgatgag tatacgttgt 7260 

cctggccacg gcagaagcac gcttatcgct ccaatttccc acaacattag tcaactccgt 7320 

taggcccttc attgaaagaa atgaggtcat caaatgtctt ccaatgtgag attttgggcc 7380 

attttttata gcaaagattg aataaggcgc atttttcttc aaagctttat tgtacgatct 7440 

gactaagtta tcttttaata attggtattc ctgtttattg cttgaagaat tgccggtcct 7500 

atttactcgt tttaggactg gttcagaatt cctcaaaaat tcatccaaat atacaagtgg 7560 

atcgatgata agctgtcaaa catgagaatt cttgaagacg aaagggcctc gtgatacgcc 7620 

tatttttata ggttaatgtc atgataataa tggtttctta gacgtcaggt ggcacttttc 7680 

ggggaaatgt gcgcggaacc cctatttgtt tatttttcta aatacattca aatatgtatc 7740 

cgctcatgag acaataaccc tgataaatgc ttcaataata ttgaaaaagg aagagtatga 7800 

gtattcaaca tttccgtgtc gcccttattc ccttttttgc ggcattttgc cttcctgttt 7860 

ttgctcaccc agaaacgctg gtgaaagtaa aagatgctga agatcagttg ggtgcacgag 7920 

tgggttacat cgaactggat ctcaacagcg gtaagatcct tgagagtttt cgccccgaag 7980 

aacgttttcc aatgatgagc acttttaaag ttctgctatg tggcgcggta ttatcccgtg 8040 

ttgacgccgg gcaagagcaa ctcggtcgcc gcatacacta ttctcagaat gacttggttg 8100 

agtactcacc agtcacagaa aagcatctta cggatggcat gacagtaaga gaattatgca 8160 

gtgctgccat aaccatgagt gataacactg cggccaactt acttctgaca acgatcggag 8220 

gaccgaagga gctaaccgct tttttgcaca acatggggga tcatgtaact cgccttgatc 8280 

gttgggaacc ggagctgaat gaagccatac caaacgacga gcgtgacacc acgatgcctg 8340 

cagcaatggc aacaacgttg cgcaaactat taactggcga actacttact ctagcttccc 8400 

ggcaacaatt aatagactgg atggaggcgg ataaagttgc aggaccactt ctgcgctcgg 8460 

cccttccggc tggctggttt attgctgata aatctggagc cggtgagcgt gggtctcgcg 8520 

gtatcattgc agcactgggg ccagatggta agccctcccg tatcgtagtt atctacacga 8580 

cggggagtca ggcaactatg gatgaacgaa atagacagat cgctgagata ggtgcctcac 8640 

tgattaagca ttggtaactg tcagaccaag tttactcata tatactttag attgatttaa 8700 

aacttcattt ttaatttaaa aggatctagg tgaagatcct ttttgataat ctcatgacca 8760 

aaatccctta acgtgagttt tcgttccact gagcgtcaga ccccgtagaa aagatcaaag 8820 

gatcttcttg agatcctttt tttctgcgcg taatctgctg cttgcaaaca aaaaaaccac 8880 

cgctaccagc ggtggtttgt ttgccggatc aagagctacc aactcttttt ccgaaggtaa 8940 

ctggcttcag cagagcgcag ataccaaata ctgtccttct agtgtagccg tagttaggcc 9000 

accacttcaa gaactctgta gcaccgccta catacctcgc tctgctaatc ctgttaccag 9060 

tggctgctgc cagtggcgat aagtcgtgtc ttaccgggtt ggactcaaga cgatagttac 9120 

cggataaggc gcagcggtcg ggctgaacgg ggggttcgtg cacacagccc agcttggagc 9180 

gaacgaccta caccgaactg agatacctac agcgtgagct atgagaaagc gccacgcttc 9240 

ccgaagggag aaaggcggac aggtatccgg taagcggcag ggtcggaaca ggagagcgca 9300 

cgagggagct tccaggggga aacgcctggt atctttatag tcctgtcggg tttcgccacc 9360 

tctgacttga gcgtcgattt ttgtgatgct cgtcaggggg gcggagccta tggaaaaacg 9420 

ccagcaacgc ggccttttta cggttcctgg ccttttgctg gccttttgct cacatgttct 9480 

ttcctgcgtt atcccctgat tctgtggata accgtattac cgcctttgag tgagctgata 9540 

ccgctcgccg cagccgaacg accgagcgca gcgagtcagt gagcgaggaa gcggaagagc 9600 

gcctgatgcg gtattttctc cttacgcatc tgtgcggtat ttcacaccgc atatggtgca 9660 

ctctcagtac aatctgctct gatgccgcat agttaagcca gtatacactc cgctatcgct 9720 

acgtgactgg gtcatggctg cgccccgaca cccgccaaca cccgctgacg cgccctgacg 9780 

ggcttgtctg ctcccggcat ccgcttacag acaagctgtg accgtctccg ggagctgcat 9840 

gtgtcagagg ttttcaccgt catcaccgaa acgcgcgagg cagctgcggt aaagctcatc 9900 

agcgtggtcg tgaagcgatt cacagatgtc tgcctgttca tccgcgtcca gctcgttgag 9960 

tttctccaga agcgttaatg tctggcttct gataaagcgg gccatgttaa gggcggtttt 10020 

ttcctgtttg gtcactgatg cctccgtgta agggggattt ctgttcatgg gggtaatgat 10080 

accgatgaaa cgagagagga tgctcacgat acgggttact gatgatgaac atgcccggtt 10140 

actggaacgt tgtgagggta aacaactggc ggtatggatg cggcgggacc agagaaaaat 10200 

cactcagggt caatgccagc gcttcgttaa tacagatgta ggtgttccac agggtagcca 10260 

gcagcatcct gcgatgcaga tccggaacat aatggtgcag ggcgctgact tccgcgtttc 10320 

cagactttac gaaacacgga aaccgaagac cattcatgtt gttgctcagg tcgcagacgt 10380 

tttgcagcag cagtcgcttc acgttcgctc gcgtatcggt gattcattct gctaaccagt 10440 

aaggcaaccc cgccagccta gccgggtcct caacgacagg agcacgatca tgcgcacccg 10500 

tggccaggac ccaacgctgc ccgagatgcg ccgcgtgcgg ctgctggaga tggcggacgc 10560 

gacggatatg ttctgccaag ggttggtttg cgcattcaca gttctccgca agaattgatt 10620 

ggctccaatt cttggagtgg tgaatccgtt agcgaggtgc cgccggcttc cattcaggtc 10680 

gaggtggccc ggctccatgc accgcgacgc aacgcgggga ggcagacaag gtatagggcg 10740 

gcgcctacaa tccatgccaa cccgttccat gtgctcgccg aggcggcata aatcgccgtg 10800 

acgatcagcg gtccaatgat cgaagttagg ctggtaagag ccgcgagcga tccttgaagc 10860 

tgtccctgat ggtcgtcatc tacctgcctg gacagcatgg cctgcaacgc gggcatcccg 10920 

atgccgccgg aagcgagaag aatcataatg gggaaggcca tccagcctcg cgtcgcgaac 10980 

gccagcaaga cgtagcccag cgcgtcggcc gccatgccgg cgataatggc ctgcttctcg 11040 

ccgaaacgtt tggtggcggg accagtgacg aaggcttgag cgagggcgtg caagattccg 11100 

aataccgcaa gcgacaggcc gatcatcgtc gcgctccagc gaaagcggtc ctcgccgaaa 11160 

atgacccaga gcgctgccgg cacctgtcct acgagttgca tgataaagaa gacagtcata 11220 

agtgcggcga cgatagtcat gccccgcgcc caccggaagg agctgactgg gttgaaggct 11280 

ctcaagggca tcggtcgagg atccttcaat atgcgcacat acgctgttat gttcaaggtc 11340 

ccttcgttta agaacgaaag cggtcttcct tttgagggat gtttcaagtt gttcaaatct 11400 

atcaaatttg caaatcccca gtctgtatct agagcgttga atcggtgatg cgatttgtta 11460 

attaaattga tggtgtcacc attaccaggt ctagatatac caatggcaaa ctgagcacaa 11520 

caataccagt ccggatcaac tggcaccatc tctcccgtag tctcatctaa tttttcttcc 11580 

ggatgaggtt ccagatatac cgcaacacct ttattatggt ttccctgagg gaataataga 11640 

atgtcccatt cgaaatcacc aattctaaac ctgggcgaat tgtatttcgg gtttgttaac 11700 

tcgttccagt caggaatgtt ccacgtgaag ctatcttcca gcaaagtctc cacttcttca 11760 

tcaaattgtg gagaatactc ccaatgctct tatctatggg acttccggga aacacagtac 11820 

cgatacttcc caattcgtct tcagagctca ttgtttgttt gaagagacta atcaaagaat 11880 

cgttttctca aaaaaattaa tatcttaact gatagtttga tcaaaggggc aaaacgtagg 11940 

ggcaaacaaa cggaaaaatc gtttctcaaa ttttctgatg ccaagaactc taaccagtct 12000 

tatctaaaaa ttgccttatg atccgtctct ccggttacag cctgtgtaac tgattaatcc 12060 

tgcctttcta atcaccattc taatgtttta attaagggat tttgtcttca ttaacggctt 12120 

tcgctcataa aaatgttatg acgttttgcc cgcaggcggg aaaccatcca cttcacgaga 12180 

ctgatctcct ctgccggaac accgggcatc tccaacttat aagttggaga aataagagaa 12240 

tttcagattg agagaatgaa aaaaaaaaac ccttagttca taggtccatt ctcttagcgc 12300 

aactacagag aacaggggca caaacaggca aaaaacgggc acaacctcaa tggagtgatg 12360 

caacctgcct ggagtaaatg atgacacaag gcaattgacc cacgcatgta tctatctcat 12420 

tttcttacac cttctattac cttctgctct ctctgatttg gaaaaagctg aaaaaaaagg 12480 

ttgaaaccag ttccctgaaa ttattcccct acttgactaa taagtatata aagacggtag 12540 

gtattgattg taattctgta aatctatttc ttaaacttct taaattctac ttttatagtt 12600 

agtctttttt ttagttttaa aacaccaaga acttagtttc gaataaacac acataaacaa 12660 

acaagcttac aaaacaaa atg gct gca tat gca gct cag ggc tat aag gtg 12711
Met Ala Ala Tyr Ala Ala Gln Gly Tyr Lys Val
1 5 10

cta gta ctc aac ccc tct gtt gct gca aca ctg ggc ttt ggt gct tac 12759
Leu Val Leu Asn Pro Ser Val Ala Ala Thr Leu Gly Phe Gly Ala Tyr
15 20 25

atg tcc aag gct cat ggg atc gat cct aac atc agg acc ggg gtg aga 12807
Met Ser Lys Ala His Gly Ile Asp Pro Asn Ile Arg Thr Gly Val Arg
30 35 40

aca att acc act ggc agc ccc atc acg tac tcc acc tac ggc aag ttc 12855
Thr Ile Thr Thr Gly Ser Pro Ile Thr Tyr Ser Thr Tyr Gly Lys Phe
45 50 55

ctt gcc gac ggc ggg tgc tcg ggg ggc gct tat gac ata ata att tgt 12903
Leu Ala Asp Gly Gly Cys Ser Gly Gly Ala Tyr Asp Ile Ile Ile Cys
60 65 70 75

gac gag tgc cac tcc acg gat gcc aca tcc atc ttg ggc att ggc act 12951
Asp Glu Cys His Ser Thr Asp Ala Thr Ser Ile Leu Gly Ile Gly Thr
80 85 90

gtc ctt gac caa gca gag act gcg ggg gcg aga ctg gtt gtg ctc gcc 12999
Val Leu Asp Gln Ala Glu Thr Ala Gly Ala Arg Leu Val Val Leu Ala
95 100 105

acc gcc acc cct ccg ggc tcc gtc act gtg ccc cat ccc aac atc gag 13047
Thr Ala Thr Pro Pro Gly Ser Val Thr Val Pro His Pro Asn Ile Glu
110 115 120

gag gtt gct ctg tcc acc acc gga gag atc cct ttt tac ggc aag gct 13095
Glu Val Ala Leu Ser Thr Thr Gly Glu Ile Pro Phe Tyr Gly Lys Ala
125 130 135

atc ccc ctc gaa gta atc aag ggg ggg aga cat ctc atc ttc tgt cat 13143
Ile Pro Leu Glu Val Ile Lys Gly Gly Arg His Leu Ile Phe Cys His
140 145 150 155

tca aag aag aag tgc gac gaa ctc gcc gca aag ctg gtc gca ttg ggc 13191
Ser Lys Lys Lys Cys Asp Glu Leu Ala Ala Lys Leu Val Ala Leu Gly
160 165 170

atc aat gcc gtg gcc tac tac cgc ggt ctt gac gtg tcc gtc atc ccg 13239
Ile Asn Ala Val Ala Tyr Tyr Arg Gly Leu Asp Val Ser Val Ile Pro
175 180 185

acc agc ggc gat gtt gtc gtc gtg gca acc gat gcc ctc atg acc ggc 13287
Thr Ser Gly Asp Val Val Val Val Ala Thr Asp Ala Leu Met Thr Gly
190 195 200

tat acc ggc gac ttc gac tcg gtg ata gac tgc aat acg tgt gtc acc 13335
Tyr Thr Gly Asp Phe Asp Ser Val Ile Asp Cys Asn Thr Cys Val Thr
205 210 215

cag aca gtc gat ttc agc ctt gac cct acc ttc acc att gag aca atc 13383
Gln Thr Val Asp Phe Ser Leu Asp Pro Thr Phe Thr Ile Glu Thr Ile
220 225 230 235

acg ctc ccc caa gat gct gtc tcc cgc act caa cgt cgg ggc agg act 13431
Thr Leu Pro Gln Asp Ala Val Ser Arg Thr Gln Arg Arg Gly Arg Thr
240 245 250

ggc agg ggg aag cca ggc atc tac aga ttt gtg gca ccg ggg gag cgc 13479
Gly Arg Gly Lys Pro Gly Ile Tyr Arg Phe Val Ala Pro Gly Glu Arg
255 260 265

ccc tcc ggc atg ttc gac tcg tcc gtc ctc tgt gag tgc tat gac gca 13527
Pro Ser Gly Met Phe Asp Ser Ser Val Leu Cys Glu Cys Tyr Asp Ala
270 275 280

ggc tgt gct tgg tat gag ctc acg ccc gcc gag act aca gtt agg cta 13575
Gly Cys Ala Trp Tyr Glu Leu Thr Pro Ala Glu Thr Thr Val Arg Leu
285 290 295

cga gcg tac atg aac acc ccg ggg ctt ccc gtg tgc cag gac cat ctt 13623
Arg Ala Tyr Met Asn Thr Pro Gly Leu Pro Val Cys Gln Asp His Leu
300 305 310 315

gaa ttt tgg gag ggc gtc ttt aca ggc ctc act cat ata gat gcc cac 13671
Glu Phe Trp Glu Gly Val Phe Thr Gly Leu Thr His Ile Asp Ala His
320 325 330

ttt cta tcc cag aca aag cag agt ggg gag aac ctt cct tac ctg gta 13719
Phe Leu Ser Gln Thr Lys Gln Ser Gly Glu Asn Leu Pro Tyr Leu Val
335 340 345

gcg tac caa gcc acc gtg tgc gct agg gct caa gcc cct ccc cca tcg 13767
Ala Tyr Gln Ala Thr Val Cys Ala Arg Ala Gln Ala Pro Pro Pro Ser
350 355 360

tgg gac cag atg tgg aag tgt ttg att cgc ctc aag ccc acc ctc cat 13815
Trp Asp Gln Met Trp Lys Cys Leu Ile Arg Leu Lys Pro Thr Leu His
365 370 375

ggg cca aca ccc ctg cta tac aga ctg ggc gct gtt cag aat gaa atc 13863
Gly Pro Thr Pro Leu Leu Tyr Arg Leu Gly Ala Val Gln Asn Glu Ile
380 385 390 395

acc ctg acg cac cca gtc acc aaa tac atc atg aca tgc atg tcg gcc 13911
Thr Leu Thr His Pro Val Thr Lys Tyr Ile Met Thr Cys Met Ser Ala
400 405 410

gac ctg gag gtc gtc acg agc acc tgg gtg ctc gtt ggc ggc gtc ctg 13959
Asp Leu Glu Val Val Thr Ser Thr Trp Val Leu Val Gly Gly Val Leu
415 420 425

gct gct ttg gcc gcg tat tgc ctg tca aca ggc tgc gtg gtc ata gtg 14007
Ala Ala Leu Ala Ala Tyr Cys Leu Ser Thr Gly Cys Val Val Ile Val
430 435 440

ggc agg gtc gtc ttg tcc ggg aag ccg gca atc ata cct gac agg gaa 14055
Gly Arg Val Val Leu Ser Gly Lys Pro Ala Ile Ile Pro Asp Arg Glu
445 450 455

gtc ctc tac cga gag ttc gat gag atg gaa gag tgc tct cag cac tta 14103
Val Leu Tyr Arg Glu Phe Asp Glu Met Glu Glu Cys Ser Gln His Leu
460 465 470 475

ccg tac atc gag caa ggg atg atg ctc gcc gag cag ttc aag cag aag 14151
Pro Tyr Ile Glu Gln Gly Met Met Leu Ala Glu Gln Phe Lys Gln Lys
480 485 490

gcc ctc ggc ctc ctg cag acc gcg tcc cgt cag gca gag gtt atc gcc 14199
Ala Leu Gly Leu Leu Gln Thr Ala Ser Arg Gln Ala Glu Val Ile Ala
495 500 505

cct gct gtc cag acc aac tgg caa aaa ctc gag acc ttc tgg gcg aag 14247
Pro Ala Val Gln Thr Asn Trp Gln Lys Leu Glu Thr Phe Trp Ala Lys
510 515 520

cat atg tgg aac ttc atc agt ggg ata caa tac ttg gcg ggc ttg tca 14295
His Met Trp Asn Phe Ile Ser Gly Ile Gln Tyr Leu Ala Gly Leu Ser
525 530 535

acg ctg cct ggt aac ccc gcc att gct tca ttg atg gct ttt aca gct 14343
Thr Leu Pro Gly Asn Pro Ala Ile Ala Ser Leu Met Ala Phe Thr Ala
540 545 550 555

gct gtc acc agc cca cta acc act agc caa acc ctc ctc ttc aac ata 14391
Ala Val Thr Ser Pro Leu Thr Thr Ser Gln Thr Leu Leu Phe Asn Ile
560 565 570

ttg ggg ggg tgg gtg gct gcc cag ctc gcc gcc ccc ggt gcc gct act 14439
Leu Gly Gly Trp Val Ala Ala Gln Leu Ala Ala Pro Gly Ala Ala Thr
575 580 585

gcc ttt gtg ggc gct ggc tta gct ggc gcc gcc atc ggc agt gtt gga 14487
Ala Phe Val Gly Ala Gly Leu Ala Gly Ala Ala Ile Gly Ser Val Gly
590 595 600

ctg ggg aag gtc ctc ata gac atc ctt gca ggg tat ggc gcg ggc gtg 14535
Leu Gly Lys Val Leu Ile Asp Ile Leu Ala Gly Tyr Gly Ala Gly Val
605 610 615

gcg gga gct ctt gtg gca ttc aag atc atg agc ggt gag gtc ccc tcc 14583
Ala Gly Ala Leu Val Ala Phe Lys Ile Met Ser Gly Glu Val Pro Ser
620 625 630 635

acg gag gac ctg gtc aat cta ctg ccc gcc atc ctc tcg ccc gga gcc 14631
Thr Glu Asp Leu Val Asn Leu Leu Pro Ala Ile Leu Ser Pro Gly Ala
640 645 650

ctc gta gtc ggc gtg gtc tgt gca gca ata ctg cgc cgg cac gtt ggc 14679
Leu Val Val Gly Val Val Cys Ala Ala Ile Leu Arg Arg His Val Gly
655 660 665

ccg ggc gag ggg gca gtg cag tgg atg aac cgg ctg ata gcc ttc gcc 14727
Pro Gly Glu Gly Ala Val Gln Trp Met Asn Arg Leu Ile Ala Phe Ala
670 675 680

tcc cgg ggg aac cat gtt tcc ccc acg cac tac gtg ccg gag agc gat 14775
Ser Arg Gly Asn His Val Ser Pro Thr His Tyr Val Pro Glu Ser Asp
685 690 695

gca gct gcc cgc gtc act gcc ata ctc agc agc ctc act gta acc cag 14823
Ala Ala Ala Arg Val Thr Ala Ile Leu Ser Ser Leu Thr Val Thr Gln
700 705 710 715

ctc ctg agg cga ctg cac cag tgg ata agc tcg gag tgt acc act cca 14871
Leu Leu Arg Arg Leu His Gln Trp Ile Ser Ser Glu Cys Thr Thr Pro
720 725 730

tgc tcc ggt tcc tgg cta agg gac atc tgg gac tgg ata tgc gag gtg 14919
Cys Ser Gly Ser Trp Leu Arg Asp Ile Trp Asp Trp Ile Cys Glu Val
735 740 745

ttg agc gac ttt aag acc tgg cta aaa gct aag ctc atg cca cag ctg 14967
Leu Ser Asp Phe Lys Thr Trp Leu Lys Ala Lys Leu Met Pro Gln Leu
750 755 760

cct ggg atc ccc ttt gtg tcc tgc cag cgc ggg tat aag ggg gtc tgg 15015
Pro Gly Ile Pro Phe Val Ser Cys Gln Arg Gly Tyr Lys Gly Val Trp
765 770 775

cga ggg gac ggc atc atg cac act cgc tgc cac tgt gga gct gag atc 15063
Arg Gly Asp Gly Ile Met His Thr Arg Cys His Cys Gly Ala Glu Ile
780 785 790 795

act gga cat gtc aaa aac ggg acg atg agg atc gtc ggt cct agg acc 15111
Thr Gly His Val Lys Asn Gly Thr Met Arg Ile Val Gly Pro Arg Thr
800 805 810

tgc agg aac atg tgg agt ggg acc ttc ccc att aat gcc tac acc acg 15159
Cys Arg Asn Met Trp Ser Gly Thr Phe Pro Ile Asn Ala Tyr Thr Thr
815 820 825

ggc ccc tgt acc ccc ctt cct gcg ccg aac tac acg ttc gcg cta tgg 15207
Gly Pro Cys Thr Pro Leu Pro Ala Pro Asn Tyr Thr Phe Ala Leu Trp
830 835 840

agg gtg tct gca gag gaa tac gtg gag ata agg cag gtg ggg gac ttc 15255
Arg Val Ser Ala Glu Glu Tyr Val Glu Ile Arg Gln Val Gly Asp Phe
845 850 855

cac tac gtg acg ggt atg act act gac aat ctt aaa tgc ccg tgc cag 15303
His Tyr Val Thr Gly Met Thr Thr Asp Asn Leu Lys Cys Pro Cys Gln
860 865 870 875

gtc cca tcg ccc gaa ttt ttc aca gaa ttg gac ggg gtg cgc cta cat 15351
Val Pro Ser Pro Glu Phe Phe Thr Glu Leu Asp Gly Val Arg Leu His
880 885 890

agg ttt gcg ccc ccc tgc aag ccc ttg ctg cgg gag gag gta tca ttc 15399
Arg Phe Ala Pro Pro Cys Lys Pro Leu Leu Arg Glu Glu Val Ser Phe
895 900 905

aga gta gga ctc cac gaa tac ccg gta ggg tcg caa tta cct tgc gag 15447
Arg Val Gly Leu His Glu Tyr Pro Val Gly Ser Gln Leu Pro Cys Glu
910 915 920

ccc gaa ccg gac gtg gcc gtg ttg acg tcc atg ctc act gat ccc tcc 15495
Pro Glu Pro Asp Val Ala Val Leu Thr Ser Met Leu Thr Asp Pro Ser
925 930 935

cat ata aca gca gag gcg gcc ggg cga agg ttg gcg agg gga tca ccc 15543
His Ile Thr Ala Glu Ala Ala Gly Arg Arg Leu Ala Arg Gly Ser Pro
940 945 950 955

ccc tct gtg gcc agc tcc tcg gct agc cag cta tcc gct cca tct ctc 15591
Pro Ser Val Ala Ser Ser Ser Ala Ser Gln Leu Ser Ala Pro Ser Leu
960 965 970

aag gca act tgc acc gct aac cat gac tcc cct gat gct gag ctc ata 15639
Lys Ala Thr Cys Thr Ala Asn His Asp Ser Pro Asp Ala Glu Leu Ile
975 980 985

gag gcc aac ctc cta tgg agg cag gag atg ggc ggc aac atc acc agg 15687
Glu Ala Asn Leu Leu Trp Arg Gln Glu Met Gly Gly Asn Ile Thr Arg
990 995 1000

gtt gag tca gaa aac aaa gtg gtg att ctg gac tcc ttc gat ccg ctt 15735
Val Glu Ser Glu Asn Lys Val Val Ile Leu Asp Ser Phe Asp Pro Leu
1005 1010 1015

gtg gcg gag gag gac gag cgg gag atc tcc gta ccc gca gaa atc ctg 15783
Val Ala Glu Glu Asp Glu Arg Glu Ile Ser Val Pro Ala Glu Ile Leu
1020 1025 1030 1035

cgg aag tct cgg aga ttc gcc cag gcc ctg ccc gtt tgg gcg cgg ccg 15831
Arg Lys Ser Arg Arg Phe Ala Gln Ala Leu Pro Val Trp Ala Arg Pro
1040 1045 1050

gac tat aac ccc ccg cta gtg gag acg tgg aaa aag ccc gac tac gaa 15879
Asp Tyr Asn Pro Pro Leu Val Glu Thr Trp Lys Lys Pro Asp Tyr Glu
1055 1060 1065

cca cct gtg gtc cat ggc tgc ccg ctt cca cct cca aag tcc cct cct 15927
Pro Pro Val Val His Gly Cys Pro Leu Pro Pro Pro Lys Ser Pro Pro
1070 1075 1080

gtg cct ccg cct cgg aag aag cgg acg gtg gtc ctc act gaa tca acc 15975
Val Pro Pro Pro Arg Lys Lys Arg Thr Val Val Leu Thr Glu Ser Thr
1085 1090 1095

cta tct act gcc ttg gcc gag ctc gcc acc aga agc ttt ggc agc tcc 16023
Leu Ser Thr Ala Leu Ala Glu Leu Ala Thr Arg Ser Phe Gly Ser Ser
1100 1105 1110 1115

tca act tcc ggc att acg ggc gac aat acg aca aca tcc tct gag ccc 16071
Ser Thr Ser Gly Ile Thr Gly Asp Asn Thr Thr Thr Ser Ser Glu Pro
1120 1125 1130

gcc cct tct ggc tgc ccc ccc gac tcc gac gct gag tcc tat tcc tcc 16119
Ala Pro Ser Gly Cys Pro Pro Asp Ser Asp Ala Glu Ser Tyr Ser Ser
1135 1140 1145

atg ccc ccc ctg gag ggg gag cct ggg gat ccg gat ctt agc gac ggg 16167
Met Pro Pro Leu Glu Gly Glu Pro Gly Asp Pro Asp Leu Ser Asp Gly
1150 1155 1160

tca tgg tca acg gtc agt agt gag gcc aac gcg gag gat gtc gtg tgc 16215
Ser Trp Ser Thr Val Ser Ser Glu Ala Asn Ala Glu Asp Val Val Cys
1165 1170 1175

tgc tca atg tct tac tct tgg aca ggc gca ctc gtc acc ccg tgc gcc 16263
Cys Ser Met Ser Tyr Ser Trp Thr Gly Ala Leu Val Thr Pro Cys Ala
1180 1185 1190 1195

gcg gaa gaa cag aaa ctg ccc atc aat gca cta agc aac tcg ttg cta 16311
Ala Glu Glu Gln Lys Leu Pro Ile Asn Ala Leu Ser Asn Ser Leu Leu
1200 1205 1210

cgt cac cac aat ttg gtg tat tcc acc acc tca cgc agt gct tgc caa 16359
Arg His His Asn Leu Val Tyr Ser Thr Thr Ser Arg Ser Ala Cys Gln
1215 1220 1225

agg cag aag aaa gtc aca ttt gac aga ctg caa gtt ctg gac agc cat 16407
Arg Gln Lys Lys Val Thr Phe Asp Arg Leu Gln Val Leu Asp Ser His
1230 1235 1240

tac cag gac gta ctc aag gag gtt aaa gca gcg gcg tca aaa gtg aag 16455
Tyr Gln Asp Val Leu Lys Glu Val Lys Ala Ala Ala Ser Lys Val Lys
1245 1250 1255

gct aac ttg cta tcc gta gag gaa gct tgc agc ctg acg ccc cca cac 16503
Ala Asn Leu Leu Ser Val Glu Glu Ala Cys Ser Leu Thr Pro Pro His
1260 1265 1270 1275

tca gcc aaa tcc aag ttt ggt tat ggg gca aaa gac gtc cgt tgc cat 16551
Ser Ala Lys Ser Lys Phe Gly Tyr Gly Ala Lys Asp Val Arg Cys His
1280 1285 1290

gcc aga aag gcc gta acc cac atc aac tcc gtg tgg aaa gac ctt ctg 16599
Ala Arg Lys Ala Val Thr His Ile Asn Ser Val Trp Lys Asp Leu Leu
1295 1300 1305

gaa gac aat gta aca cca ata gac act acc atc atg gct aag aac gag 16647
Glu Asp Asn Val Thr Pro Ile Asp Thr Thr Ile Met Ala Lys Asn Glu
1310 1315 1320

gtt ttc tgc gtt cag cct gag aag ggg ggt cgt aag cca gct cgt ctc 16695
Val Phe Cys Val Gln Pro Glu Lys Gly Gly Arg Lys Pro Ala Arg Leu
1325 1330 1335

atc gtg ttc ccc gat ctg ggc gtg cgc gtg tgc gaa aag atg gct ttg 16743
Ile Val Phe Pro Asp Leu Gly Val Arg Val Cys Glu Lys Met Ala Leu
1340 1345 1350 1355

tac gac gtg gtt aca aag ctc ccc ttg gcc gtg atg gga agc tcc tac 16791
Tyr Asp Val Val Thr Lys Leu Pro Leu Ala Val Met Gly Ser Ser Tyr
1360 1365 1370

gga ttc caa tac tca cca gga cag cgg gtt gaa ttc ctc gtg caa gcg 16839
Gly Phe Gln Tyr Ser Pro Gly Gln Arg Val Glu Phe Leu Val Gln Ala
1375 1380 1385

tgg aag tcc aag aaa acc cca atg ggg ttc tcg tat gat acc cgc tgc 16887
Trp Lys Ser Lys Lys Thr Pro Met Gly Phe Ser Tyr Asp Thr Arg Cys
1390 1395 1400

ttt gac tcc aca gtc act gag agc gac atc cgt acg gag gag gca atc 16935
Phe Asp Ser Thr Val Thr Glu Ser Asp Ile Arg Thr Glu Glu Ala Ile
1405 1410 1415

tac caa tgt tgt gac ctc gac ccc caa gcc cgc gtg gcc atc aag tcc 16983
Tyr Gln Cys Cys Asp Leu Asp Pro Gln Ala Arg Val Ala Ile Lys Ser
1420 1425 1430 1435

ctc acc gag agg ctt tat gtt ggg ggc cct ctt acc aat tca agg ggg 17031
Leu Thr Glu Arg Leu Tyr Val Gly Gly Pro Leu Thr Asn Ser Arg Gly
1440 1445 1450

gag aac tgc ggc tat cgc agg tgc cgc gcg agc ggc gta ctg aca act 17079
Glu Asn Cys Gly Tyr Arg Arg Cys Arg Ala Ser Gly Val Leu Thr Thr
1455 1460 1465

agc tgt ggt aac acc ctc act tgc tac atc aag gcc cgg gca gcc tgt 17127
Ser Cys Gly Asn Thr Leu Thr Cys Tyr Ile Lys Ala Arg Ala Ala Cys
1470 1475 1480

cga gcc gca ggg ctc cag gac tgc acc atg ctc gtg tgt ggc gac gac 17175
Arg Ala Ala Gly Leu Gln Asp Cys Thr Met Leu Val Cys Gly Asp Asp
1485 1490 1495

tta gtc gtt atc tgt gaa agc gcg ggg gtc cag gag gac gcg gcg agc 17223
Leu Val Val Ile Cys Glu Ser Ala Gly Val Gln Glu Asp Ala Ala Ser
1500 1505 1510 1515

ctg aga gcc ttc acg gag gct atg acc agg tac tcc gcc ccc cct ggg 17271
Leu Arg Ala Phe Thr Glu Ala Met Thr Arg Tyr Ser Ala Pro Pro Gly
1520 1525 1530

gac ccc cca caa cca gaa tac gac ttg gag ctc ata aca tca tgc tcc 17319
Asp Pro Pro Gln Pro Glu Tyr Asp Leu Glu Leu Ile Thr Ser Cys Ser
1535 1540 1545

tcc aac gtg tca gtc gcc cac gac ggc gct gga aag agg gtc tac tac 17367
Ser Asn Val Ser Val Ala His Asp Gly Ala Gly Lys Arg Val Tyr Tyr
1550 1555 1560

ctc acc cgt gac cct aca acc ccc ctc gcg aga gct gcg tgg gag aca 17415
Leu Thr Arg Asp Pro Thr Thr Pro Leu Ala Arg Ala Ala Trp Glu Thr
1565 1570 1575

gca aga cac act cca gtc aat tcc tgg cta ggc aac ata atc atg ttt 17463
Ala Arg His Thr Pro Val Asn Ser Trp Leu Gly Asn Ile Ile Met Phe
1580 1585 1590 1595

gcc ccc aca ctg tgg gcg agg atg ata ctg atg acc cat ttc ttt agc 17511
Ala Pro Thr Leu Trp Ala Arg Met Ile Leu Met Thr His Phe Phe Ser
1600 1605 1610

gtc ctt ata gcc agg gac cag ctt gaa cag gcc ctc gat tgc gag atc 17559
Val Leu Ile Ala Arg Asp Gln Leu Glu Gln Ala Leu Asp Cys Glu Ile
1615 1620 1625

tac ggg gcc tgc tac tcc ata gaa cca ctg gat cta cct cca atc att 17607
Tyr Gly Ala Cys Tyr Ser Ile Glu Pro Leu Asp Leu Pro Pro Ile Ile
1630 1635 1640

caa aga ctc cat ggc ctc agc gca ttt tca ctc cac agt tac tct cca 17655
Gln Arg Leu His Gly Leu Ser Ala Phe Ser Leu His Ser Tyr Ser Pro
1645 1650 1655

ggt gaa atc aat agg gtg gcc gca tgc ctc aga aaa ctt ggg gta ccg 17703
Gly Glu Ile Asn Arg Val Ala Ala Cys Leu Arg Lys Leu Gly Val Pro
1660 1665 1670 1675

ccc ttg cga gct tgg aga cac cgg gcc cgg agc gtc cgc gct agg ctt 17751
Pro Leu Arg Ala Trp Arg His Arg Ala Arg Ser Val Arg Ala Arg Leu
1680 1685 1690

ctg gcc aga gga ggc agg gct gcc ata tgt ggc aag tac ctc ttc aac 17799
Leu Ala Arg Gly Gly Arg Ala Ala Ile Cys Gly Lys Tyr Leu Phe Asn
1695 1700 1705

tgg gca gta aga aca aag ctc aaa ctc act cca ata gcg gcc gct ggc 17847
Trp Ala Val Arg Thr Lys Leu Lys Leu Thr Pro Ile Ala Ala Ala Gly
1710 1715 1720

cag ctg gac ttg tcc ggc tgg ttc acg gct ggc tac agc ggg gga gac 17895
Gln Leu Asp Leu Ser Gly Trp Phe Thr Ala Gly Tyr Ser Gly Gly Asp
1725 1730 1735

att tat cac agc gtg tct cat gcc cgg ccc cgc tgg atc tgg ttt tgc 17943
Ile Tyr His Ser Val Ser His Ala Arg Pro Arg Trp Ile Trp Phe Cys
1740 1745 1750 1755

cta ctc ctg ctt gct gca ggg gta ggc atc tac ctc ctc ccc aac cga 17991
Leu Leu Leu Leu Ala Ala Gly Val Gly Ile Tyr Leu Leu Pro Asn Arg
1760 1765 1770

atg agc acg aat cct aaa cct caa aga aag acc aaa cgt aac acc aac 18039
Met Ser Thr Asn Pro Lys Pro Gln Arg Lys Thr Lys Arg Asn Thr Asn
1775 1780 1785

cgg cgg ccg cag gac gtc aag ttc ccg ggt ggc ggt cag atc gtt ggt 18087
Arg Arg Pro Gln Asp Val Lys Phe Pro Gly Gly Gly Gln Ile Val Gly
1790 1795 1800

gga gtt tac ttg ttg ccg cgc agg ggc cct aga ttg ggt gtg cgc gcg 18135
Gly Val Tyr Leu Leu Pro Arg Arg Gly Pro Arg Leu Gly Val Arg Ala
1805 1810 1815

acg aga aag act tcc gag cgg tcg caa cct cga ggt aga cgt cag cct 18183
Thr Arg Lys Thr Ser Glu Arg Ser Gln Pro Arg Gly Arg Arg Gln Pro
1820 1825 1830 1835

atc ccc aag gct cgt cgg ccc gag ggc agg acc tgg gct cag ccc ggg 18231
Ile Pro Lys Ala Arg Arg Pro Glu Gly Arg Thr Trp Ala Gln Pro Gly
1840 1845 1850

tac cct tgg ccc ctc tat ggc aat gag ggc tgc ggg tgg gcg gga tgg 18279
Tyr Pro Trp Pro Leu Tyr Gly Asn Glu Gly Cys Gly Trp Ala Gly Trp
1855 1860 1865

ctc ctg tct ccc cgt ggc tct cgg cct agc tgg ggc ccc aca gac ccc 18327
Leu Leu Ser Pro Arg Gly Ser Arg Pro Ser Trp Gly Pro Thr Asp Pro
1870 1875 1880

cgg cgt agg tcg cgc aat ttg ggt aag gtc atc gat acc ctt acg tgc 18375
Arg Arg Arg Ser Arg Asn Leu Gly Lys Val Ile Asp Thr Leu Thr Cys
1885 1890 1895

ggc ttc gcc gac ctc atg ggg tac ata ccg ctc gtc ggc gcc cct ctt 18423
Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Leu Val Gly Ala Pro Leu
1900 1905 1910 1915

gga ggc gct gcc agg gcc taatagtcga ctttgttccc actgtacttt 18471
Gly Gly Ala Ala Arg Ala
1920



tagctcgtac aaaatacaat atacttttca tttctccgta aacaacatgt tttcccatgt 18531

aatatccttt tctatttttc gttccgttac caactttaca catactttat atagctattc 18591

acttctatac actaaaaaac taagacaatt ttaattttgc tgcctgccat atttcaattt 18651

gttataaatt cctataattt atcctattag tagctaaaaa aagatgaatg tgaatcgaat 18711

cctaagagaa ttggatctga tccacaggac gggtgtggtc gccatgatcg cgtagtcgat 18771

agtggctcca agtagcgaag cgagcaggac tgggcggcgg ccaaagcggt cggacagtgc 18831

tccgagaacg ggtgcgcata gaaattgcat caacgcatat agcgctagca gcacgccata 18891

gtgactggcg atgctgtcgg aatggacgat atcccgcaag aggcccggca gtaccggcat 18951

aaccaagcct atgcctacag catccagggt gacggtgccg aggatgacga tgagcgcatt 19011

gttagatttc atacacggtg cctgactgcg ttagcaattt aactgtgata aactaccgca 19071

ttaaagcttt ttctttccaa tttttttttt ttcgtcatta taaaaatcat tacgaccgag 19131

attcccgggt aataactgat ataattaaat tgaagctcta atttgtgagt ttagtataca 19191

tgcatttact tataatacag ttttttagtt ttgctggccg catcttctca aatatgcttc 19251

ccagcctgct tttctgtaac gttcaccctc taccttagca tcccttccct ttgcaaatag 19311

tcctcttcca acaataataa tgtcagatcc tgtagagacc acatcatcca cggttctata 19371

ctgttgaccc aatgcgtctc ccttgtcatc taaacccaca ccgggtgtca taatcaacca 19431

atcgtaacct tcatctcttc cacccatgtc tctttgagca ataaagccga taacaaaatc 19491

tttgtcgctc ttcgcaatgt caacagtacc cttagtatat tctccagtag atagggagcc 19551

cttgcatgac aattctgcta acatcaaaag gcctctaggt tcctttgtta cttcttctgc 19611

cgcctgcttc aaaccgctaa caatacctgg gcccaccaca ccgtgtgcat tcgtaatgtc 19671

tgcccattct gctattctgt atacacccgc agagtactgc aatttgactg tattaccaat 19731

gtcagcaaat tttctgtctt cgaagagtaa aaaattgtac ttggcggata atgcctttag 19791

cggcttaact gtgccctcca tggaaaaatc agtcaagata tccacatgtg tttttagtaa 19851

acaaattttg ggacctaatg cttcaactaa ctccagtaat tccttggtgg tacgaacatc 19911

caatgaagca cacaagtttg tttgcttttc gtgcatgata ttaaatagct tggcagcaac 19971

aggactagga tgagtagcag cacgttcctt atatgtagct ttcgacatga tttatcttcg 20031

tttcctgcag gtttttgttc tgtgcagttg ggttaagaat actgggcaat ttcatgtttc 20091

ttcaacacta catatgcgta tatataccaa tctaagtctg tgctccttcc ttcgttcttc 20151

cttctgttcg gagattaccg aatcaaaaaa atttcaagga aaccgaaatc aaaaaaaaga 20211

ataaaaaaaa aatgatgaat tgaaaagctt atcgat 20247



<210> 19 
<211> 1921 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence:
pd.delta.NS3NS5.pj.core150

<400> 19 

Met Ala Ala Tyr Ala Ala Gln Gly Tyr Lys Val Leu Val Leu Asn Pro
1 5 10 15

Ser Val Ala Ala Thr Leu Gly Phe Gly Ala Tyr Met Ser Lys Ala His
20 25 30

Gly Ile Asp Pro Asn Ile Arg Thr Gly Val Arg Thr Ile Thr Thr Gly
35 40 45

Ser Pro Ile Thr Tyr Ser Thr Tyr Gly Lys Phe Leu Ala Asp Gly Gly
50 55 60

Cys Ser Gly Gly Ala Tyr Asp Ile Ile Ile Cys Asp Glu Cys His Ser
65 70 75 80

Thr Asp Ala Thr Ser Ile Leu Gly Ile Gly Thr Val Leu Asp Gln Ala
85 90 95

Glu Thr Ala Gly Ala Arg Leu Val Val Leu Ala Thr Ala Thr Pro Pro
100 105 110

Gly Ser Val Thr Val Pro His Pro Asn Ile Glu Glu Val Ala Leu Ser
115 120 125

Thr Thr Gly Glu Ile Pro Phe Tyr Gly Lys Ala Ile Pro Leu Glu Val
130 135 140

Ile Lys Gly Gly Arg His Leu Ile Phe Cys His Ser Lys Lys Lys Cys
145 150 155 160

Asp Glu Leu Ala Ala Lys Leu Val Ala Leu Gly Ile Asn Ala Val Ala
165 170 175

Tyr Tyr Arg Gly Leu Asp Val Ser Val Ile Pro Thr Ser Gly Asp Val
180 185 190

Val Val Val Ala Thr Asp Ala Leu Met Thr Gly Tyr Thr Gly Asp Phe
195 200 205

Asp Ser Val Ile Asp Cys Asn Thr Cys Val Thr Gln Thr Val Asp Phe
210 215 220

Ser Leu Asp Pro Thr Phe Thr Ile Glu Thr Ile Thr Leu Pro Gln Asp
225 230 235 240

Ala Val Ser Arg Thr Gln Arg Arg Gly Arg Thr Gly Arg Gly Lys Pro
245 250 255

Gly Ile Tyr Arg Phe Val Ala Pro Gly Glu Arg Pro Ser Gly Met Phe
260 265 270

Asp Ser Ser Val Leu Cys Glu Cys Tyr Asp Ala Gly Cys Ala Trp Tyr
275 280 285

Glu Leu Thr Pro Ala Glu Thr Thr Val Arg Leu Arg Ala Tyr Met Asn
290 295 300

Thr Pro Gly Leu Pro Val Cys Gln Asp His Leu Glu Phe Trp Glu Gly
305 310 315 320

Val Phe Thr Gly Leu Thr His Ile Asp Ala His Phe Leu Ser Gln Thr
325 330 335

Lys Gln Ser Gly Glu Asn Leu Pro Tyr Leu Val Ala Tyr Gln Ala Thr
340 345 350

Val Cys Ala Arg Ala Gln Ala Pro Pro Pro Ser Trp Asp Gln Met Trp
355 360 365

Lys Cys Leu Ile Arg Leu Lys Pro Thr Leu His Gly Pro Thr Pro Leu
370 375 380

Leu Tyr Arg Leu Gly Ala Val Gln Asn Glu Ile Thr Leu Thr His Pro
385 390 395 400

Val Thr Lys Tyr Ile Met Thr Cys Met Ser Ala Asp Leu Glu Val Val
405 410 415

Thr Ser Thr Trp Val Leu Val Gly Gly Val Leu Ala Ala Leu Ala Ala
420 425 430

Tyr Cys Leu Ser Thr Gly Cys Val Val Ile Val Gly Arg Val Val Leu
435 440 445

Ser Gly Lys Pro Ala Ile Ile Pro Asp Arg Glu Val Leu Tyr Arg Glu
450 455 460

Phe Asp Glu Met Glu Glu Cys Ser Gln His Leu Pro Tyr Ile Glu Gln
465 470 475 480

Gly Met Met Leu Ala Glu Gln Phe Lys Gln Lys Ala Leu Gly Leu Leu
485 490 495

Gln Thr Ala Ser Arg Gln Ala Glu Val Ile Ala Pro Ala Val Gln Thr
500 505 510

Asn Trp Gln Lys Leu Glu Thr Phe Trp Ala Lys His Met Trp Asn Phe
515 520 525

Ile Ser Gly Ile Gln Tyr Leu Ala Gly Leu Ser Thr Leu Pro Gly Asn
530 535 540

Pro Ala Ile Ala Ser Leu Met Ala Phe Thr Ala Ala Val Thr Ser Pro
545 550 555 560

Leu Thr Thr Ser Gln Thr Leu Leu Phe Asn Ile Leu Gly Gly Trp Val
565 570 575

Ala Ala Gln Leu Ala Ala Pro Gly Ala Ala Thr Ala Phe Val Gly Ala
580 585 590

Gly Leu Ala Gly Ala Ala Ile Gly Ser Val Gly Leu Gly Lys Val Leu
595 600 605

Ile Asp Ile Leu Ala Gly Tyr Gly Ala Gly Val Ala Gly Ala Leu Val
610 615 620

Ala Phe Lys Ile Met Ser Gly Glu Val Pro Ser Thr Glu Asp Leu Val
625 630 635 640

Asn Leu Leu Pro Ala Ile Leu Ser Pro Gly Ala Leu Val Val Gly Val
645 650 655

Val Cys Ala Ala Ile Leu Arg Arg His Val Gly Pro Gly Glu Gly Ala
660 665 670

Val Gln Trp Met Asn Arg Leu Ile Ala Phe Ala Ser Arg Gly Asn His
675 680 685

Val Ser Pro Thr His Tyr Val Pro Glu Ser Asp Ala Ala Ala Arg Val
690 695 700

Thr Ala Ile Leu Ser Ser Leu Thr Val Thr Gln Leu Leu Arg Arg Leu
705 710 715 720

His Gln Trp Ile Ser Ser Glu Cys Thr Thr Pro Cys Ser Gly Ser Trp
725 730 735

Leu Arg Asp Ile Trp Asp Trp Ile Cys Glu Val Leu Ser Asp Phe Lys
740 745 750

Thr Trp Leu Lys Ala Lys Leu Met Pro Gln Leu Pro Gly Ile Pro Phe
755 760 765

Val Ser Cys Gln Arg Gly Tyr Lys Gly Val Trp Arg Gly Asp Gly Ile
770 775 780

Met His Thr Arg Cys His Cys Gly Ala Glu Ile Thr Gly His Val Lys
785 790 795 800

Asn Gly Thr Met Arg Ile Val Gly Pro Arg Thr Cys Arg Asn Met Trp
805 810 815

Ser Gly Thr Phe Pro Ile Asn Ala Tyr Thr Thr Gly Pro Cys Thr Pro
820 825 830

Leu Pro Ala Pro Asn Tyr Thr Phe Ala Leu Trp Arg Val Ser Ala Glu
835 840 845

Glu Tyr Val Glu Ile Arg Gln Val Gly Asp Phe His Tyr Val Thr Gly
850 855 860

Met Thr Thr Asp Asn Leu Lys Cys Pro Cys Gln Val Pro Ser Pro Glu
865 870 875 880

Phe Phe Thr Glu Leu Asp Gly Val Arg Leu His Arg Phe Ala Pro Pro
885 890 895

Cys Lys Pro Leu Leu Arg Glu Glu Val Ser Phe Arg Val Gly Leu His
900 905 910

Glu Tyr Pro Val Gly Ser Gln Leu Pro Cys Glu Pro Glu Pro Asp Val
915 920 925

Ala Val Leu Thr Ser Met Leu Thr Asp Pro Ser His Ile Thr Ala Glu
930 935 940

Ala Ala Gly Arg Arg Leu Ala Arg Gly Ser Pro Pro Ser Val Ala Ser
945 950 955 960

Ser Ser Ala Ser Gln Leu Ser Ala Pro Ser Leu Lys Ala Thr Cys Thr
965 970 975

Ala Asn His Asp Ser Pro Asp Ala Glu Leu Ile Glu Ala Asn Leu Leu
980 985 990

Trp Arg Gln Glu Met Gly Gly Asn Ile Thr Arg Val Glu Ser Glu Asn
995 1000 1005

Lys Val Val Ile Leu Asp Ser Phe Asp Pro Leu Val Ala Glu Glu Asp
1010 1015 1020

Glu Arg Glu Ile Ser Val Pro Ala Glu Ile Leu Arg Lys Ser Arg Arg
1025 1030 1035 1040

Phe Ala Gln Ala Leu Pro Val Trp Ala Arg Pro Asp Tyr Asn Pro Pro
1045 1050 1055

Leu Val Glu Thr Trp Lys Lys Pro Asp Tyr Glu Pro Pro Val Val His
1060 1065 1070

Gly Cys Pro Leu Pro Pro Pro Lys Ser Pro Pro Val Pro Pro Pro Arg
1075 1080 1085

Lys Lys Arg Thr Val Val Leu Thr Glu Ser Thr Leu Ser Thr Ala Leu
1090 1095 1100

Ala Glu Leu Ala Thr Arg Ser Phe Gly Ser Ser Ser Thr Ser Gly Ile
1105 1110 1115 1120

Thr Gly Asp Asn Thr Thr Thr Ser Ser Glu Pro Ala Pro Ser Gly Cys
1125 1130 1135

Pro Pro Asp Ser Asp Ala Glu Ser Tyr Ser Ser Met Pro Pro Leu Glu
1140 1145 1150

Gly Glu Pro Gly Asp Pro Asp Leu Ser Asp Gly Ser Trp Ser Thr Val
1155 1160 1165

Ser Ser Glu Ala Asn Ala Glu Asp Val Val Cys Cys Ser Met Ser Tyr
1170 1175 1180

Ser Trp Thr Gly Ala Leu Val Thr Pro Cys Ala Ala Glu Glu Gln Lys
1185 1190 1195 1200

Leu Pro Ile Asn Ala Leu Ser Asn Ser Leu Leu Arg His His Asn Leu
1205 1210 1215

Val Tyr Ser Thr Thr Ser Arg Ser Ala Cys Gln Arg Gln Lys Lys Val
1220 1225 1230

Thr Phe Asp Arg Leu Gln Val Leu Asp Ser His Tyr Gln Asp Val Leu
1235 1240 1245

Lys Glu Val Lys Ala Ala Ala Ser Lys Val Lys Ala Asn Leu Leu Ser
1250 1255 1260

Val Glu Glu Ala Cys Ser Leu Thr Pro Pro His Ser Ala Lys Ser Lys
1265 1270 1275 1280

Phe Gly Tyr Gly Ala Lys Asp Val Arg Cys His Ala Arg Lys Ala Val
1285 1290 1295

Thr His Ile Asn Ser Val Trp Lys Asp Leu Leu Glu Asp Asn Val Thr
1300 1305 1310

Pro Ile Asp Thr Thr Ile Met Ala Lys Asn Glu Val Phe Cys Val Gln
1315 1320 1325

Pro Glu Lys Gly Gly Arg Lys Pro Ala Arg Leu Ile Val Phe Pro Asp
1330 1335 1340

Leu Gly Val Arg Val Cys Glu Lys Met Ala Leu Tyr Asp Val Val Thr
1345 1350 1355 1360

Lys Leu Pro Leu Ala Val Met Gly Ser Ser Tyr Gly Phe Gln Tyr Ser
1365 1370 1375

Pro Gly Gln Arg Val Glu Phe Leu Val Gln Ala Trp Lys Ser Lys Lys
1380 1385 1390

Thr Pro Met Gly Phe Ser Tyr Asp Thr Arg Cys Phe Asp Ser Thr Val
1395 1400 1405

Thr Glu Ser Asp Ile Arg Thr Glu Glu Ala Ile Tyr Gln Cys Cys Asp
1410 1415 1420

Leu Asp Pro Gln Ala Arg Val Ala Ile Lys Ser Leu Thr Glu Arg Leu
1425 1430 1435 1440

Tyr Val Gly Gly Pro Leu Thr Asn Ser Arg Gly Glu Asn Cys Gly Tyr
1445 1450 1455

Arg Arg Cys Arg Ala Ser Gly Val Leu Thr Thr Ser Cys Gly Asn Thr
1460 1465 1470

Leu Thr Cys Tyr Ile Lys Ala Arg Ala Ala Cys Arg Ala Ala Gly Leu
1475 1480 1485

Gln Asp Cys Thr Met Leu Val Cys Gly Asp Asp Leu Val Val Ile Cys
1490 1495 1500

Glu Ser Ala Gly Val Gln Glu Asp Ala Ala Ser Leu Arg Ala Phe Thr
1505 1510 1515 1520

Glu Ala Met Thr Arg Tyr Ser Ala Pro Pro Gly Asp Pro Pro Gln Pro
1525 1530 1535

Glu Tyr Asp Leu Glu Leu Ile Thr Ser Cys Ser Ser Asn Val Ser Val
1540 1545 1550

Ala His Asp Gly Ala Gly Lys Arg Val Tyr Tyr Leu Thr Arg Asp Pro
1555 1560 1565

Thr Thr Pro Leu Ala Arg Ala Ala Trp Glu Thr Ala Arg His Thr Pro
1570 1575 1580

Val Asn Ser Trp Leu Gly Asn Ile Ile Met Phe Ala Pro Thr Leu Trp
1585 1590 1595 1600

Ala Arg Met Ile Leu Met Thr His Phe Phe Ser Val Leu Ile Ala Arg
1605 1610 1615

Asp Gln Leu Glu Gln Ala Leu Asp Cys Glu Ile Tyr Gly Ala Cys Tyr
1620 1625 1630

Ser Ile Glu Pro Leu Asp Leu Pro Pro Ile Ile Gln Arg Leu His Gly
1635 1640 1645

Leu Ser Ala Phe Ser Leu His Ser Tyr Ser Pro Gly Glu Ile Asn Arg
1650 1655 1660

Val Ala Ala Cys Leu Arg Lys Leu Gly Val Pro Pro Leu Arg Ala Trp
1665 1670 1675 1680

Arg His Arg Ala Arg Ser Val Arg Ala Arg Leu Leu Ala Arg Gly Gly
1685 1690 1695

Arg Ala Ala Ile Cys Gly Lys Tyr Leu Phe Asn Trp Ala Val Arg Thr
1700 1705 1710

Lys Leu Lys Leu Thr Pro Ile Ala Ala Ala Gly Gln Leu Asp Leu Ser
1715 1720 1725

Gly Trp Phe Thr Ala Gly Tyr Ser Gly Gly Asp Ile Tyr His Ser Val
1730 1735 1740

Ser His Ala Arg Pro Arg Trp Ile Trp Phe Cys Leu Leu Leu Leu Ala
1745 1750 1755 1760

Ala Gly Val Gly Ile Tyr Leu Leu Pro Asn Arg Met Ser Thr Asn Pro
1765 1770 1775

Lys Pro Gln Arg Lys Thr Lys Arg Asn Thr Asn Arg Arg Pro Gln Asp
1780 1785 1790

Val Lys Phe Pro Gly Gly Gly Gln Ile Val Gly Gly Val Tyr Leu Leu
1795 1800 1805

Pro Arg Arg Gly Pro Arg Leu Gly Val Arg Ala Thr Arg Lys Thr Ser
1810 1815 1820

Glu Arg Ser Gln Pro Arg Gly Arg Arg Gln Pro Ile Pro Lys Ala Arg
1825 1830 1835 1840

Arg Pro Glu Gly Arg Thr Trp Ala Gln Pro Gly Tyr Pro Trp Pro Leu
1845 1850 1855

Tyr Gly Asn Glu Gly Cys Gly Trp Ala Gly Trp Leu Leu Ser Pro Arg
1860 1865 1870

Gly Ser Arg Pro Ser Trp Gly Pro Thr Asp Pro Arg Arg Arg Ser Arg
1875 1880 1885

Asn Leu Gly Lys Val Ile Asp Thr Leu Thr Cys Gly Phe Ala Asp Leu
1890 1895 1900

Met Gly Tyr Ile Pro Leu Val Gly Ala Pro Leu Gly Gly Ala Ala Arg
1905 1910 1915 1920

Ala